Abstract

A version of quantum theory is derived from a set of plausible assumptions 
related to the following general setting: For a given system there is a set of 
experiments that can be performed, and for each such experiment an ordinary 
statistical model is defined. The parameters of the single experiments are func- 
tions of a hyperparameter, which defines the state of the system. There is a 
symmetry group acting on the hyperparameters, and for the induced action on 
the parameters of the single experiment a simple consistency property is as- 
sumed, called permissibility of the parametric function. The other assumptions 
needed are rather weak. The derivation relies partly on quantum logic, partly 
on a group representation of the hyperparameter group, where the invariant 
spaces are shown to be in 1-1 correspondence with the equivalence classes of 
permissible parametric functions. Planck's constant only plays a role connected 
to generators of unitary group representations. 
1 Introduction.



The two great revolutions in physics at the beginning of this century - relativity 
and quantum mechanics - still influence nearly all aspects of theoretical physics. 
Similar as they may be, both in their impact on modern science and in the way they 
in their time turned conventional ideas upside down, there are also of course great 
differences - both in origin, appearance and type of content. Relativity theory was 
founded by one man, Einstein, while the ideas of quantum theory developed over 
time through the work of many people, most notably Planck, Bohr, Schrödinger,
Heisenberg, Pauli and Dirac. An equally important - and perhaps related - aspect, 
is the following: Relativity theory can be developed logically from a few intuitively 
clear, nearly obvious, concepts and axioms, essentially only constancy of the speed 
of light and invariance of physical laws under change of coordinate system, while 
quantum theory still has a rather awkward foundation in its abstract concepts for 
states and observables. 

Over the years, several attempts at a deeper foundation of quantum theory
have been made; some of these we will return to later. Although there is far from a 
universal agreement on the foundation, today's physicists, both theoretical and ex- 
perimental, have developed a clear intuition directly connected to states of a system 
as rays in a complex Hilbert space and observables as self-adjoint operators in the
same space. The theory has had success in very many fields - some claim that quantum
theory is the most successful physical theory ever advanced - but it has also met
problems: Difficulties with defining the border between object and observer in von
Neumann's quantum measurement theory; difficulties with interpretations requiring
many worlds or action at a distance; infinities in quantum field theory requiring com- 
plicated renormalization programs; difficulties in reconciling the theory with general 
relativity and so on. We will of course not try to attack all these problems here. 
What we will assert, though, is that the fact that such difficulties occur, does make 
it legitimate to look at the foundation of the theory with fresh eyes again. One way 
to do this, is to try to find a foundation which is in accordance with common sense. 
Another way is to compare it with another, apparently unrelated, theoretical area. 
In this paper we will try to combine both these lines of attack. 

A vital clue is the role of probability in quantum theory. In the beginning, this 
was an aspect that overshadowed all other difficulties in the theory, and that made 
leading physicists - first of them Einstein himself - sceptical: The new physical laws 
were then and are still claimed to be probabilistic by necessity. Still some people are 
looking at hidden variable theories in attempts to avoid the fact that the fundamental 
laws of nature are stochastic, but after the experiments of Aspect et al. (1982) and 
other overwhelming evidence, most scientists seem to accept stochasticity of nature 
as an established fact. 
At the same time, statistical science has developed methodology that has found
applications in an increasing number of empirical sciences, methodology largely based 
upon stochastic modelling. With this background, the obvious, but apparently very 
difficult question then is: Why has there been virtually no scientific contact between 
physicists and statisticians throughout this century? This lack of contact is in fact 
very striking. At the same time as Dirac was developing a foundation for relativis- 
tic quantum theory in Cambridge in England, R.A. Fisher completely independently 
developed the foundation of statistical inference theory based upon probability mod- 
els in Rothamsted and London. While modern quantum field theory was developed 
by Feynman in Princeton and Schwinger at Harvard, J. Neyman and coworkers laid
the foundation of modern statistical inference theory in Berkeley. One of the few
early contacts that I know of is Feynman's (1951) Berkeley Symposium paper on 
the interpretation of probabilities in quantum mechanics. Today, quantum theory 
sometimes has its own session at large international statistical conferences, but the 
language spoken there is of a nature which is difficult to understand for ordinary 
statisticians. Meyer's (1993) book on non-commutative probability theory may be 
seen as an attempt to make a synthesis of the two worlds, but this book does not 
address the foundation question if we seek a foundation related to common sense. It 
is also becoming increasingly apparent that there are similarities between advanced 
probability theory and quantum theory, but these similarities seem to be mostly at 
the formal level. 

The lack of a common ground for modern physics and statistics is even more 
surprising when we know that the outcome of any single experiment with fixed 
experimental arrangement can always be described by ordinary probability models, 
also in the world of particle physics. It is in cases where several arrangements are 
possible, as when one has the choice between measuring position and momentum of a 
particle, that quantum mechanics gives results which cannot be reached by ordinary 
probability theory. In addition, quantum theory gives definite rules for computing 
probabilities, also in cases where the ordinary probability concept can be used in 
principle. 

We will formulate below an extended experimental setting, which includes the 
possible decisions which must be made before the experiment itself is carried out, at 
least before the final inference is made. We will look at the situation where strong 
symmetries exist both within the single experiments and in the wider experimental 
setting between the single experiments. This, together with other reasonable as- 
sumptions, will lead to probability models of the type found in ordinary quantum 
mechanics. 

The physical implications of this theory, at least in its non-relativistic variant, are
not expected to differ considerably from existing quantum theory. Questions related
to mathematical equivalence will be discussed later. An interesting further challenge
may be the question of finding a corresponding relativistic theory. This issue will not
be pursued here, but in light of the importance of symmetry groups in high energy 
physics, it seems very plausible that it should be possible to develop the approach 
in this direction, too. A brief discussion on this will be given in the last section. 

Our aim is to write the paper in such a way that the principal ideas can be appreciated
by theoretical physicists, mathematicians and statisticians alike. Technical
details at several points are unavoidable, however. Also, even though we will try to 
be fairly precise, at least in the main results, there may still be room for improve- 
ment in mathematical rigor. What we feel are the most important assumptions, are 
stated explicitly. Minor technical assumptions are stated in the text. 

The discussion of this paper is continued in Helland (1999), where we among other
things give a more explicit construction of the basic Hilbert space, a construction 
which is independent of the apparatus of quantum logic. 



2 Experimental setting. 

The common statistical framework for analyzing an experiment is a sample space $\mathcal{X}$,
listing the possible experimental outcomes, a fixed $\sigma$-algebra (Boolean algebra) $\mathcal{F}$
of subsets of $\mathcal{X}$, and a class $\{P_\theta;\ \theta \in \Theta\}$ of probability measures on the measurable
space $(\mathcal{X}, \mathcal{F})$. The parameter $\theta$ - or a function of this parameter - is ordinarily the
unknown quantity which the statistician aims at saying something about using the
outcome of the experiment. A fixed $\theta$, or alternatively, a probability distribution
expressing prior knowledge about $\theta$, may also be related to the physicist's concept of
'state'. A simple purpose of a statistical experiment might be to estimate $\theta$, which
formally means to select a function $\hat{\theta}$ on the sample space $\mathcal{X}$ such that $\hat{\theta}(x)$ is a
reasonable estimate of $\theta$ when the observation $x$ is given. There is a considerable
literature on statistical inference; three good and thorough books with different 
perspectives are Berger (1985), Lehmann (1983) and Cox and Hinkley (1974). Both 
at the more specialized and at the more elementary level there are very many books, 
of course. An important point is that intuition related to statistical methodology in 
this ordinary sense has been developed in a large number of empirical sciences, also 
in parts of experimental physics. 

A very general approach to statistics assumes a decision theoretical framework: 
First, a space $D$ of possible decisions is defined, given the experimental outcome; for
instance, $D$ can consist of different estimators of $\theta$. Then a loss function $L(\theta, d(\cdot))$ is
specified, giving the loss incurred by taking the decision $d$ when the true parameter is $\theta$. The
choice of decision is typically made by minimizing the expected loss. By restricting
the class of decision functions so that they possess invariance properties, or so that
estimators have the correct expectation, this can often be done uniformly over the
unknown parameter. Another possible course - gaining increasing popularity - is 
to assume a prior probability distribution over the parameter space, and then base 
inference on the corresponding posterior distribution, given the data. This is called 
the Bayesian approach. 
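
As a minimal sketch of the Bayesian route just described, the following Python fragment works through a toy binomial experiment. The three-point parameter grid, the uniform prior, the squared-error loss and all function names are assumptions chosen purely for illustration; they are not taken from the text.

```python
from fractions import Fraction
from math import comb

# Hypothetical finite parameter space with a uniform prior (illustrative only).
thetas = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = {t: Fraction(1, 3) for t in thetas}

def likelihood(x, n, theta):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def posterior(x, n):
    """Posterior distribution over thetas given the observation x."""
    weights = {t: prior[t] * likelihood(x, n, t) for t in thetas}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

def expected_loss(d, post):
    """Posterior expected squared-error loss of the decision d."""
    return sum(p * (t - d) ** 2 for t, p in post.items())

post = posterior(x=3, n=4)
# Under squared-error loss the posterior mean minimizes the expected loss.
bayes_estimate = sum(t * p for t, p in post.items())
```

Exact rational arithmetic is used so that the comparison of expected losses is not blurred by rounding; other priors or loss functions would lead to other decisions.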

Essential for what follows, is that we will extend the traditional statistical frame- 
work to include possible actions taken by an experimenter before or during a given 
experiment. The most important actions for our purposes are those that label the
whole experiment, and thus allow different experiments to be done in the same
situation. However, we will also allow actions that change the class of probability
measures, the space of decisions and/or the loss function.

Thus we start with a space $\mathcal{A}$ of possible actions, and for each $a \in \mathcal{A}$ we have
an experiment consisting of a probability model $(\mathcal{X}_a, \mathcal{F}_a, \{P^a_\theta;\ \theta \in \Theta_a\})$, and
possibly a loss function $L_a(\cdot, \cdot)$ and a space $D_a$ of potential decisions. In macroscopic
experiments such actions are in fact very common. They are not usually explicitly
taken into account in the statistical analysis, but are taken as fixed once and for all,
an attitude which is fairly obvious in some cases, but is certainly open to discussion in
other cases. In fact, more can be said, also in the ordinary statistical setting, for a
closer link between the experimental design phase (choosing $a$) and the statistical
analysis phase. A partial list of possibilities for choosing $a$ includes:

(a) Choice between a number of essentially different experiments that are possible 
to perform in a given situation. 

(b) Choice of target population, and way to select experimental units, including 
choice of randomization. Choice of conditioning in models is related to these issues. 

(c) Choice of treatments. Here are many variants: There are lots of examples of 
medical treatments which are mutually exclusive. In factorial experiments the choice 
of factors and the levels of these are important issues.

(d) A choice between different statistical models; for instance, one may want to 
reduce the number of parameters if the model is too complicated to give firm decisions.
In fact, this is permissible under certain conditions. 

One main purpose of this paper will be to explore the relationship between macro- 
scopic statistical modelling and the microscopic modelling we find in the quantum 
mechanical world. In the latter one definitely has the choice between performing
different experiments, say, measuring position or momentum or measuring spin in different
directions. The quantum mechanical state of a particle or a particle system can
be used to predict the outcome of any one of these experiments, once the choice is
made. In this way one might say that the quantum mechanical state vector contains
the simultaneous model of a large set of possible statistical experiments $\mathcal{E}_a$ ($a \in \mathcal{A}$).
A major reason why this is possible, is that the situation contains a high degree of
symmetry. To study the connection between quantum mechanics and statistics more
directly, it turns out, however, to be useful first to consider a simpler representation 
than the ordinary Hilbert space representation of quantum mechanics. 



3 The lattice approach to quantum theory. 

Mathematically, a $\sigma$-lattice is defined as a partially ordered set $\mathcal{L}$ such that the infimum
and supremum (with respect to the given ordering) of every countable subset
exist and belong to $\mathcal{L}$. In this section we will summarize the approach to quantum
mechanics taking lattices of propositions as points of departure. These can be looked
upon as generalizations of the $\sigma$-algebras of classical probability, where the propositions
are subsets of a given space $\mathcal{X}$, ordered by set inclusion; the term Boolean
algebra is sometimes used for the same concept. For Boolean algebras, the partial
ordering corresponds to set inclusion, and infimum and supremum correspond
to intersection and union, respectively. In the following we will mainly consider
(complete) lattices, where the infimum and supremum of all subsets, not only the
countable ones, exist.

The lattice approach to quantum mechanics consists of formulating a set of
axioms for a lattice which is weaker than the set of axioms needed to define a Boolean
algebra, just sufficiently weak that they are satisfied by the set of projections in a
Hilbert space, which are the entities that represent propositions in conventional 
quantum mechanics. We will follow largely Beltrametti and Cassinelli (1981) in 
presenting these axioms. It is relatively easy to prove that the axioms below actually 
are satisfied by Hilbert space projections, much more difficult to show that the 
axioms by necessity imply that the lattice has a Hilbert space representation. We 
proceed with the necessary definitions; in the next section we will try to relate these 
definitions to sets of potential experiments as formulated above. 

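We have already defined a lattice $\mathcal{L}$ as a set of propositions $\{\mathcal{P}\}$ with a partial
ordering $\le$ such that the supremum $\bigvee_i \mathcal{P}_i$ exists and belongs to $\mathcal{L}$ when all the $\mathcal{P}_i$
belong to $\mathcal{L}$, and similarly for the infimum. ($\bigvee_i \mathcal{P}_i$ is defined as a proposition $\mathcal{P}$ such that
$\mathcal{P}_i \le \mathcal{P}$ for all $\mathcal{P}_i$, and such that $\mathcal{P}_i \le \mathcal{P}_0$ for all $\mathcal{P}_i$ implies $\mathcal{P} \le \mathcal{P}_0$. It is easy to show
that the supremum, if it exists, is unique if the lattice is assumed to have the property
that $\mathcal{P}_1 \le \mathcal{P}_2$ and $\mathcal{P}_2 \le \mathcal{P}_1$ implies $\mathcal{P}_1 = \mathcal{P}_2$. The infimum is defined correspondingly.)

It follows that $\mathcal{L}$ contains the infimum of all propositions, denoted 0, and the
supremum of all propositions, denoted 1.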

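What corresponds to the complement in a Boolean algebra, is here denoted by
orthocomplement. An orthocomplementation in a lattice $\mathcal{L}$ is a mapping $\mathcal{P} \to \mathcal{P}^\perp$ of
$\mathcal{L}$ onto itself such that (i) $\mathcal{P}^{\perp\perp} = \mathcal{P}$, (ii) $\mathcal{P}_1 \le \mathcal{P}_2$ implies $\mathcal{P}_2^\perp \le \mathcal{P}_1^\perp$, (iii) $\mathcal{P} \wedge \mathcal{P}^\perp = 0$
and (iv) $\mathcal{P} \vee \mathcal{P}^\perp = 1$. It is easy to see that De Morgan's laws follow from (ii):

$(\bigvee_i \mathcal{P}_i)^\perp = \bigwedge_i \mathcal{P}_i^\perp$, $(\bigwedge_i \mathcal{P}_i)^\perp = \bigvee_i \mathcal{P}_i^\perp$.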

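Two propositions $\mathcal{P}_1$ and $\mathcal{P}_2$ are said to be disjoint or orthogonal, written $\mathcal{P}_1 \perp \mathcal{P}_2$,
when $\mathcal{P}_1 \le \mathcal{P}_2^\perp$ (or, equivalently, when $\mathcal{P}_2 \le \mathcal{P}_1^\perp$). A subset of $\mathcal{L}$ formed by pairwise
disjoint elements is simply called a disjoint subset. The lattice $\mathcal{L}$ is called separable
if every disjoint subset of $\mathcal{L}$ is at most countable.

If $\mathcal{P}_1 \le \mathcal{P}_2$, we will write $\mathcal{P}_2 - \mathcal{P}_1$ for $\mathcal{P}_2 \wedge \mathcal{P}_1^\perp$. It is clear that we then have
$(\mathcal{P}_2 - \mathcal{P}_1) \perp \mathcal{P}_1$. A lattice $\mathcal{L}$ is called orthomodular if $\mathcal{P}_1 \le \mathcal{P}_2$ implies $\mathcal{P}_2 = \mathcal{P}_1 \vee
(\mathcal{P}_2 - \mathcal{P}_1)$. One other way to put this, is that the distributive laws hold for the triple
$(\mathcal{P}_1, \mathcal{P}_2, \mathcal{P}_1^\perp)$ when $\mathcal{P}_1 \le \mathcal{P}_2$.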

A much stronger requirement is that the distributive laws should hold for all 
triples: 

$\mathcal{P}_1 \vee (\mathcal{P}_2 \wedge \mathcal{P}_3) = (\mathcal{P}_1 \vee \mathcal{P}_2) \wedge (\mathcal{P}_1 \vee \mathcal{P}_3)$, $\mathcal{P}_1 \wedge (\mathcal{P}_2 \vee \mathcal{P}_3) = (\mathcal{P}_1 \wedge \mathcal{P}_2) \vee (\mathcal{P}_1 \wedge \mathcal{P}_3)$.

An orthocomplemented distributive lattice is in general a Boolean algebra, and can
always be realized as an algebra of subsets of a fixed set.

The lattices of quantum mechanics are orthocomplemented, but nondistributive;
they only satisfy the weaker requirement of being orthomodular. Much of the pioneering
work in the lattice approach to quantum mechanics is due to Mackey (1963).
Two books on orthomodular lattices are Beran (1985) and Kalmbach (1983). This
represents an approach to quantum mechanics that in some sense is more primitive 
than most other approaches, but there are still concepts here, like the lattice prop- 
erty or orthomodularity, which are difficult to get an intuitive relation to. Our aim 
here will be to try to understand these concepts to some extent in terms of ordi- 
nary statistical models for a set of potential experiments and in terms of symmetry 
properties. 
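
The difference between distributivity and orthomodularity can be seen in a very small example. The sketch below is an illustration under stated assumptions, not a construction from the text: it takes the subspaces of the real plane restricted to the origin, four lines and the whole plane, with each line's orthocomplement being the perpendicular line. The labels and the helper functions `join`, `meet` and `leq` are hypothetical names introduced here.

```python
# Toy lattice: the origin (ZERO), the whole plane (ONE), the coordinate
# axes "x" and "y", and the two diagonals "d" and "e" (illustrative labels).
ZERO, ONE = "0", "1"
ELEMENTS = [ZERO, "x", "y", "d", "e", ONE]
# Orthocomplement: each line is mapped to the line perpendicular to it.
COMP = {ZERO: ONE, ONE: ZERO, "x": "y", "y": "x", "d": "e", "e": "d"}

def join(p, q):
    """Smallest subspace containing both p and q."""
    if p == q or q == ZERO:
        return p
    if p == ZERO:
        return q
    return ONE  # two distinct lines, or anything joined with ONE, span the plane

def meet(p, q):
    """Largest subspace contained in both p and q."""
    if p == q or q == ONE:
        return p
    if p == ONE:
        return q
    return ZERO  # two distinct lines intersect only in the origin

def leq(p, q):
    """Partial ordering by inclusion, expressed through the join."""
    return join(p, q) == q

# Distributivity fails for three distinct lines.
lhs = join("x", meet("y", "d"))             # x joined with the origin
rhs = meet(join("x", "y"), join("x", "d"))  # plane met with plane
distributive = lhs == rhs

# Orthomodularity: p <= q implies q = p v (q ^ p-perp).
orthomodular = all(
    q == join(p, meet(q, COMP[p]))
    for p in ELEMENTS for q in ELEMENTS if leq(p, q)
)
```

Here `distributive` evaluates to `False` while `orthomodular` evaluates to `True`, which is the pattern described above for the lattices of quantum mechanics.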

In addition to orthomodularity, two further properties are required by Beltrametti
and Cassinelli (1981): The lattices are assumed to be atomic and to have the
covering property.

A nonzero element $\mathcal{P}_0$ of $\mathcal{L}$ is called an atom if $0 \le \mathcal{P} \le \mathcal{P}_0$ implies $\mathcal{P} = 0$ or
$\mathcal{P} = \mathcal{P}_0$. The lattice $\mathcal{L}$ is called atomic if for every $\mathcal{P} \ne 0$ there exists an atom $\mathcal{P}_0$
such that $\mathcal{P}_0 \le \mathcal{P}$. It can be shown that if $\mathcal{L}$ is an orthomodular, atomic lattice,
then every element of $\mathcal{L}$ is the union of the atoms that it contains. If the lattice is
separable, this union is at most countable.

We say that $\mathcal{L}$ has the covering property if for every $\mathcal{P}_1$ in $\mathcal{L}$ and every atom $\mathcal{P}_0$
such that $\mathcal{P}_1 \wedge \mathcal{P}_0 = 0$, we have that $\mathcal{P}_1 \le \mathcal{P}_2 \le \mathcal{P}_1 \vee \mathcal{P}_0$ implies either $\mathcal{P}_2 = \mathcal{P}_1$ or
$\mathcal{P}_2 = \mathcal{P}_1 \vee \mathcal{P}_0$.

In total, the kind of propositional structure which is promoted in Beltrametti
and Cassinelli (1981) as a basis for quantum mechanics is an orthocomplemented,
orthomodular, separable lattice which is atomic and has the covering property. All
these requirements are proved to hold for the lattice of projection operators in a 
separable Hilbert space, or equivalently, the lattice of all closed subspaces of the 
Hilbert space. 

One can also prove results in the opposite direction, though this is much more 
difficult: If the given requirements hold for some lattice, then one can construct 
an isomorphic Hilbert space in the sense that the projections upon subspaces of 
this Hilbert space are in one-to-one correspondence with the propositions of the 
lattice, with corresponding ordering. The proof of this last result is only hinted at 
in Beltrametti and Cassinelli (1981); more details are given in Piron (1976) and in 
Maeda and Maeda (1970). Related results can also be found in Varadarajan (1985). 
One problem is to convince oneself that the complex number field is the natural one to
choose as a basis for the Hilbert space: One can also construct representations based
on the real or quaternion number field. 

Finally, the concept of state in quantum mechanics can be defined as a probability
measure on the propositions of a lattice. In the Hilbert space representation the
famous theorem of Gleason says that all states can be represented by density operators
$\rho$ (positive operators with trace $\mathrm{tr}(\rho) = 1$) in the sense that the expectation of
every observable, represented by a self-adjoint operator $A$, is given by

$\mathrm{tr}(A\rho)$.

As is well known from probability theory, the set of expectations for all variables
determines the set of probability distributions for all variables.
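
To make the trace formula concrete, here is a minimal sketch for a two-dimensional Hilbert space. The particular state (an equal superposition of the two basis vectors) and the choice of observables are assumptions made for illustration only, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))

def expectation(observable, rho):
    """Expectation of an observable in the state rho, computed as tr(A rho)."""
    return trace(matmul(observable, rho))

# Density operator of the pure state (|0> + |1>) / sqrt(2): positive, trace one.
rho = [[0.5, 0.5], [0.5, 0.5]]

sigma_z = [[1, 0], [0, -1]]   # self-adjoint observable with outcomes +1 and -1
sigma_x = [[0, 1], [1, 0]]    # observable for which rho is an eigenstate
proj_up = [[1, 0], [0, 0]]    # projection, i.e. a proposition in the lattice

expect_z = expectation(sigma_z, rho)
expect_x = expectation(sigma_x, rho)
prob_up = expectation(proj_up, rho)
```

For a projection the expectation is the probability of the corresponding proposition, so `prob_up` is the value that the state, viewed as a probability measure on the lattice, assigns to that proposition.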



4 A set of potential statistical experiments. 

Let us start by turning to a completely different situation where the concept of 
'state' is also being used. Consider a medical patient. One way to make precise 
what is meant by the state of this patient is to contemplate the potential results 
of all possible tests that the patient can be exposed to, where the word 'test' is 
used in a very wide sense, possibly including treatments or parts of treatments. 
Thus in a concrete setting like this we can imagine a large number of potential 
experiments, some possibly mutually exclusive, and let the state be defined abstractly 
as the totality of probability distributions of results from these experiments, or some 
parameter determining all these probability distributions. A similar concept can be 
imagined for a set of several medical patients, where randomization and allocation of
treatment may be included as part of the potential experiments under consideration. 
Here, the focus may be on the treatments rather than on the patients, but still the 
results of the experiments depend upon the state of the patients, and - if we consider 
a large enough collection of experiments for a large enough collection of patients in 
the same state - the potential outcomes may hopefully determine these states. 

In general now, fix some concrete experimental setting, and let $\mathcal{A}$ be a set of
potential experiments, in statistical terminology $\mathcal{E}_a = (\mathcal{X}_a, \mathcal{F}_a, \{P^a_\theta;\ \theta \in \Theta_a\})$ with
decision space $D_a$ and loss function $L_a(\theta, d(\cdot))$ for $a \in \mathcal{A}$.

For two experiments $\mathcal{E}_a$ and $\mathcal{E}_{a'}$ it may be crucial whether or not these can both
be performed on the experimental unit(s) - say the patient - in such a way that one 
experiment does not disturb the result of the other. There are lots of examples where 
such disturbance takes place, or where even one experiment may preclude the other: 
Biopsy of a possible beginning tumour may make the evaluation of a simple medical 
treatment difficult; a psychiatric patient may be treated by psychopharmaceuticals or by
classical psychoanalysis, but the evaluation of both approaches on the same patient 
may be impossible. In other settings, say factorial experiments, similar phenomena 
occur: An industrial experiment with a fixed set of units may be performed with 
one given set of factors or another set, not orthogonal to the first one, but including 
both sets will lead to a different experiment. The effect of nitrogen in some fertilizer 
may be evaluated in a small fixed experiment with or without potassium present, 
not both. 

We will assume that for any pair $a$ and $a'$ it is always possible to decide whether
or not these experiments can be performed simultaneously without disturbing each
other. If this is the case, we say that the experiments are compatible. Two compatible
experiments can always be joined into a compound experiment by taking the cartesian
product: $\mathcal{E}_a \otimes \mathcal{E}_{a'} = (\mathcal{X}_a, \mathcal{F}_a, \{P^a_\theta;\ \theta \in \Theta_a\}) \otimes (\mathcal{X}_{a'}, \mathcal{F}_{a'}, \{P^{a'}_\theta;\ \theta \in \Theta_{a'}\})$, and similarly for
larger sets of experiments. Sometimes the parameter set for the joint experiment
can be simplified in such cases.

In fact we will assume that all potential experiments that can be performed on
a given set of units depend on a common (multidimensional) parameter $\phi$, defined
on a space $\Phi$ and connected to the state of the units. Later we will show that
it may be natural instead to associate the state concept to a distribution over $\Phi$,
but everything that is said below, in particular the ordering of propositions, can be
repeated with states as probability measures. So to keep things reasonably simple,
we will keep $\phi$ fixed in this discussion. (In fact we will do so until Section 10 below.)

Then $\theta$ in the experiment $\mathcal{E}_a$ is a function of the common parameter $\phi$, say,
$\theta = \theta_a(\phi)$; it is assumed that the parameter spaces $\Phi$ and $\Theta_a$ are equipped with
$\sigma$-algebras and that each function $\theta = \theta_a(\phi)$ is measurable. Furthermore, it is
convenient to assume that each parameter is identifiable: For each $a$ and pair $(\theta, \theta')$,
if $P^a_\theta(E) = P^a_{\theta'}(E)$ for all $E \in \mathcal{F}_a$ then $\theta = \theta'$.

Since $\sigma$-algebras may be coarsened (data reduction), there is a natural partial
ordering between the experiments: Say that $\mathcal{E}_{a'} \le \mathcal{E}_a$ if $\mathcal{X}_{a'} = \mathcal{X}_a$ and $\mathcal{F}_{a'} \subseteq \mathcal{F}_a$. Then
the probability models are assumed to be consistent: $\Theta_{a'} \subseteq \Theta_a$ and $P^{a'}_\theta = P^a_\theta|_{\mathcal{F}_{a'}}$ for
$\theta \in \Theta_{a'}$. We will let $\mathcal{A}_0$ be the extreme set of experiments on a given set of units, so
that for $\mathcal{E}_a$ with $a \in \mathcal{A}_0$ there is no $\mathcal{E}_{a'} \ne \mathcal{E}_a$ such that $\mathcal{E}_a \le \mathcal{E}_{a'}$.

By a proposition $\mathcal{P}$ we mean an experiment together with an event from this
experiment: $\mathcal{P} = (a, E_a)$, where $E_a$ belongs to the $\sigma$-algebra $\mathcal{F}_a$.

A partial ordering of the propositions from the same experiment is first defined
as the obvious one: We say that $(a, E_{1a}) \le (a, E_{2a})$ if $E_{1a} \subseteq E_{2a}$. This ordering will
be generalized to some pairs of propositions from different experiments:



Definition 1. 

We say that $\mathcal{P}_1 = (a_1, E_{1a_1}) \le \mathcal{P}_2 = (a_2, E_{2a_2})$ iff $P^{a_1}_{\theta_{a_1}(\phi)}(E_{1a_1}) \le P^{a_2}_{\theta_{a_2}(\phi)}(E_{2a_2})$
for all state parameters $\phi$.

In order that $\mathcal{P}_1 \le \mathcal{P}_2$ and $\mathcal{P}_2 \le \mathcal{P}_1$ together shall imply $\mathcal{P}_1 = \mathcal{P}_2$, we will
here identify $\mathcal{P}_1$ and $\mathcal{P}_2$ if $P^{a_1}_{\theta_{a_1}(\phi)}(E_{1a_1}) = P^{a_2}_{\theta_{a_2}(\phi)}(E_{2a_2})$ for all $\phi$. This may lead
to somewhat unfortunate situations where unrelated propositions are identified, but
mathematically it is convenient. Among other things it is necessary in order that
suprema shall be unique.
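
The ordering of Definition 1 and the identification of propositions can be illustrated numerically. The sketch below assumes a hypothetical setting that is not in the text: the state $\phi$ lies in $[0, 1]$, and two Bernoulli experiments have success probabilities $\phi^2$ and $\phi$. A finite grid stands in for "all $\phi$", so the check is an illustration rather than a proof.

```python
from fractions import Fraction

# Hypothetical experiments: "a1" succeeds with probability phi**2,
# "a2" succeeds with probability phi, for a state phi in [0, 1].
SUCCESS_PROB = {"a1": lambda phi: phi ** 2, "a2": lambda phi: phi}
GRID = [Fraction(i, 20) for i in range(21)]  # finite stand-in for all phi

def prob(proposition, phi):
    """Probability of a proposition (a, E), where E is a subset of {0, 1}."""
    a, event = proposition
    p = SUCCESS_PROB[a](phi)
    return sum(p if outcome == 1 else 1 - p for outcome in event)

def leq(p1, p2):
    """Ordering across experiments: compare probabilities for every phi."""
    return all(prob(p1, phi) <= prob(p2, phi) for phi in GRID)

def identified(p1, p2):
    """Two propositions are identified when their probabilities always agree."""
    return leq(p1, p2) and leq(p2, p1)

P1 = ("a1", frozenset({1}))       # success in the first experiment
P2 = ("a2", frozenset({1}))       # success in the second experiment
SURE1 = ("a1", frozenset({0, 1}))  # certain event of the first experiment
SURE2 = ("a2", frozenset({0, 1}))  # certain event of the second experiment
```

With these choices `leq(P1, P2)` holds while `leq(P2, P1)` does not, and the two certain events are identified even though they come from different experiments.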

This definition includes, but may in certain cases also be an extension of, the
trivial one when the experiments are the same. Other cases where $\mathcal{P}_1 \le \mathcal{P}_2$ include:
(i) $E_{1a_1} = \emptyset$; (ii) $E_{2a_2} = \mathcal{X}_{a_2}$; (iii) $\mathcal{E}_{a_1}$ is the cartesian product of $\mathcal{E}_{a_2}$ and another
compatible experiment $\mathcal{E}_{a_3}$, so $E_{1a_1} = E_{2a_2} \times E_{3a_3}$.

From (i) and (ii) it follows that we have to identify all propositions of the form
$(a, \emptyset)$ - these will be collected in the single proposition 0, and the propositions of the
form $(a, \mathcal{X}_a)$, which will be collected in the single proposition 1. These will be the
infimum and supremum, respectively, of the whole set of propositions.

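The orthocomplement of a proposition is defined in the straightforward way from
the complement of an event: For $\mathcal{P} = (a, E_a)$ take $\mathcal{P}^\perp = (a, E_a^c)$. It is then clear
that $\mathcal{P}^{\perp\perp} = \mathcal{P}$ and that $\mathcal{P}_1 \le \mathcal{P}_2$ implies $\mathcal{P}_2^\perp \le \mathcal{P}_1^\perp$. From the results of the next
section it follows that $\mathcal{P} \wedge \mathcal{P}^\perp = 0$ and $\mathcal{P} \vee \mathcal{P}^\perp = 1$. Thus the properties of an
orthocomplementation are satisfied.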

Sets of propositions from the same experiments will ordinarily have a natural 
supremum and infimum, corresponding to the usual unions and intersections. To 
introduce supremum and infimum for arbitrary sets of propositions, however, one
needs more structure. We will in fact add more structure by making symmetry
assumptions. But first we will show that the simple structure of sets of experiments
under weak extra assumptions implies that the set of propositions always will be an 
orthocomplemented, orthomodular poset. 



5 Orthomodular propositions from experiments. 

We start with an assumption which seems rather weak in the general setting we are 
considering, but also as it stands seems difficult to motivate directly from statistical 
reasoning. It is closely related to Axiom V in Mackey (1963), and has been formulated
again in several papers in quantum logic. Mączyński (1973) showed that this
assumption is necessary and sufficient in order that an orthocomplemented poset 
shall possess some natural properties. 

We say that $k$ propositions $\mathcal{P}_1, \ldots, \mathcal{P}_k$ with $\mathcal{P}_i = (a_i, E_i)$ are orthogonal if the
inequality

$\sum_{i=1}^{k} P^{a_i}_{\theta_{a_i}(\phi)}(E_i) \le 1$ (1)

holds for all $\phi$. We say that these propositions are pairwise orthogonal if the same
inequality holds for any pair chosen from the $k$ propositions. This is equivalent to
$\mathcal{P}_i \le \mathcal{P}_j^\perp$ for all pairs.

It is clear that orthogonality implies pairwise orthogonality. The opposite implication
holds for most other orthogonality concepts, but here statistical examples can
easily be constructed for which this does not hold. Such cases will be explicitly excluded
from our sets of propositions (strictly speaking, we only need this assumption
for $k = 3$ here).
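
One way such a counterexample might look is sketched below. The three probability functions are hypothetical and chosen only so that every pair sums to at most one while the triple does not; each proposition is represented solely by its probability as a function of a state $\phi$ in $[0, 1]$, checked on a finite grid.

```python
from fractions import Fraction
from itertools import combinations

GRID = [Fraction(i, 10) for i in range(11)]  # finite stand-in for all phi

# Hypothetical propositions from three different experiments, each given
# only by its probability as a function of the state phi.
propositions = [
    lambda phi: Fraction(1, 2),
    lambda phi: (1 + phi) / 4,
    lambda phi: (2 - phi) / 4,
]

def orthogonal(props):
    """The probabilities sum to at most one for every phi on the grid."""
    return all(sum(p(phi) for p in props) <= 1 for phi in GRID)

def pairwise_orthogonal(props):
    """The same inequality holds for every pair of propositions."""
    return all(orthogonal(pair) for pair in combinations(props, 2))

is_pairwise = pairwise_orthogonal(propositions)
is_orthogonal = orthogonal(propositions)
```

Here every pair satisfies the inequality, but the three probabilities add up to 5/4 for every $\phi$, so the triple is pairwise orthogonal without being orthogonal. Sets of this kind are exactly what the exclusion described above rules out.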



Assumption 1. 

For the propositions under consideration, any set of k pairwise orthogonal propo- 
sitions is orthogonal. 

For events $E_i$ from the same experiment, pairwise orthogonality essentially means
$E_i \cap E_j = \emptyset$ for $i \ne j$. This obviously implies orthogonality in the sense given by (1).
For general sets of propositions, Assumption 1 amounts in some sense to assuming a
certain richness of the set of models.

For the intermediate Lemma below, we may also need to extend the set of exper- 
iments under consideration. This can always be done artificially, since there are no 
limitations on the set of experiments under consideration, in particular, the set of
experiments need not be countable. We will see later how the artificial experiments 
can be deleted when final results are formulated. 



Assumption 2. 

Let $\mathcal{P}_i = (a_i, E_i)$ ($i = 1, \ldots, k$) be $k$ propositions such that (1) holds for all $\phi$.
Then there is an experiment $\mathcal{E}_{a'} = (\mathcal{X}_{a'}, \mathcal{F}_{a'}, \{P^{a'}_\theta;\ \theta \in \Theta_{a'}\})$ and $k$ pairwise disjoint
events $E'_i$ in $\mathcal{F}_{a'}$ such that $P^{a'}_{\theta_{a'}(\phi)}(E'_i) = P^{a_i}_{\theta_{a_i}(\phi)}(E_i)$ for all $\phi \in \Phi$ and $i = 1, \ldots, k$.



In the proof of the Lemma below, we will only make use of Assumption 2 for
$k = 3$, and to begin with only for $k = 2$.

Lemma 1. 

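Let $\mathcal{P}_1 = (a_1, E_1)$ and $\mathcal{P}_2 = (a_2, E_2)$ be two orthogonal propositions in the above
sense, i.e.,

$P^{a_1}_{\theta_{a_1}(\phi)}(E_1) + P^{a_2}_{\theta_{a_2}(\phi)}(E_2) \le 1$ for all $\phi$.

Then under Assumption 1 and Assumption 2 there exists a proposition $\mathcal{P} = (a, E)$
on some experiment $\mathcal{E}_a$ such that

$P^a_{\theta_a(\phi)}(E) = P^{a_1}_{\theta_{a_1}(\phi)}(E_1) + P^{a_2}_{\theta_{a_2}(\phi)}(E_2)$, (2)

and we have that $\mathcal{P} = \mathcal{P}_1 \vee \mathcal{P}_2$ for this choice of $\mathcal{P}$.



Proof.

The existence of $\mathcal{P}$ follows by first using Assumption 2. This gives $E'_1 \cap E'_2 = \emptyset$
for events $E'_1$ and $E'_2$ in some single experiment. We can then take $E = E'_1 \cup E'_2$, so
that (2) follows.

It is clear that $\mathcal{P} \ge \mathcal{P}_1$ and $\mathcal{P} \ge \mathcal{P}_2$ for this choice of $\mathcal{P}$. Assume that $\mathcal{P}_0 = (a_0, E_0)$ is
another proposition such that $\mathcal{P}_0 \ge \mathcal{P}_1$ and $\mathcal{P}_0 \ge \mathcal{P}_2$. Then by Assumption 1 and
Assumption 2 with $k = 3$ (applied to $\mathcal{P}_1$, $\mathcal{P}_2$ and $\mathcal{P}_0^\perp$, which are pairwise orthogonal)
there exists an experiment $\mathcal{E}_{a''}$ and events $E''_0$, $E''_1$ and $E''_2$
in $\mathcal{F}_{a''}$ such that $E''_1 \subseteq E''_0$, $E''_2 \subseteq E''_0$ and $E''_1 \cap E''_2 = \emptyset$. But then it follows that
$E''_1 \cup E''_2 \subseteq E''_0$, so $P^{a_0}_{\theta_{a_0}(\phi)}(E_0) \ge P^{a_1}_{\theta_{a_1}(\phi)}(E_1) + P^{a_2}_{\theta_{a_2}(\phi)}(E_2)$ for all $\phi$, so $\mathcal{P} \le \mathcal{P}_0$. Hence
$\mathcal{P} = \mathcal{P}_1 \vee \mathcal{P}_2$.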
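
The first step of the proof, passing to a single artificial experiment in which the two events are disjoint, can be sketched as follows. The two probability functions are hypothetical, and the three-outcome experiment is only one possible way of realizing the kind of experiment that Assumption 2 postulates; it is not claimed to be the construction intended in the text.

```python
from fractions import Fraction

GRID = [Fraction(i, 10) for i in range(11)]  # finite stand-in for all phi

# Hypothetical orthogonal propositions from two different experiments,
# represented by their probabilities as functions of phi in [0, 1].
p1 = lambda phi: phi / 2
p2 = lambda phi: (1 - phi) / 3

def artificial_experiment(phi):
    """Three-outcome experiment in the spirit of Assumption 2.

    Outcome 0 reproduces the probability of the first proposition,
    outcome 1 that of the second, and outcome 2 carries the rest.
    """
    return {0: p1(phi), 1: p2(phi), 2: 1 - p1(phi) - p2(phi)}

def prob(event, phi):
    """Probability of an event (a set of outcomes) in the artificial experiment."""
    dist = artificial_experiment(phi)
    return sum(dist[outcome] for outcome in event)

E1, E2 = {0}, {1}   # disjoint events standing in for the two propositions
E = E1 | E2         # their union represents the candidate supremum

additive = all(prob(E, phi) == p1(phi) + p2(phi) for phi in GRID)
```

Because the two events are disjoint within one ordinary experiment, the probability of their union is the sum of the two probabilities, which is the additivity property required of the supremum.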
The basic property of orthomodularity was defined in Section 3 for lattices. This
definition requires only a definition of supremum for pairs of orthogonal propositions.
Here is the necessary reformulation: Say that a set of propositions is orthomodular
if $\mathcal{P}_1 \le \mathcal{P}_2$ implies that $\mathcal{P}_2 = \mathcal{P}_1 \vee (\mathcal{P}_2^\perp \vee \mathcal{P}_1)^\perp$. The following result must be seen in
relation to Lemma 1. Note that neither Assumption 1 nor Assumption 2 is needed
explicitly in the Theorem.

Theorem 1.

Assume an orthocomplemented poset of propositions based on experiments with
the property that pairwise suprema exist and satisfy equation (2) for the pairwise
supremum $\mathcal{P} = (a, E)$. Then this poset will be orthomodular.



Proof. 

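Let $\mathcal{P}_i = (a_i, E_i)$ ($i = 1, 2$) satisfy $\mathcal{P}_1 \le \mathcal{P}_2$. Then by assumption, $\mathcal{P}_2^\perp \vee \mathcal{P}_1$ is
equal to $\mathcal{P} = (a, E)$ such that $P^a_{\theta_a(\phi)}(E) = 1 - P^{a_2}_{\theta_{a_2}(\phi)}(E_2) + P^{a_1}_{\theta_{a_1}(\phi)}(E_1)$. By the same
assumption $\mathcal{P}_1 \vee \mathcal{P}^\perp$ is equal to $\mathcal{P}' = (a', E')$ such that $P^{a'}_{\theta_{a'}(\phi)}(E') = P^{a_1}_{\theta_{a_1}(\phi)}(E_1) + 1 -
P^a_{\theta_a(\phi)}(E) = P^{a_2}_{\theta_{a_2}(\phi)}(E_2)$. Hence $\mathcal{P}' = \mathcal{P}_2$, and the orthomodularity property follows.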



The concept of an orthocomplemented, orthomodular poset has been central to 
the lattice approach to quantum mechanics. As stated before, to arrive at a structure 
that has a Hilbert space representation, further assumptions are needed. We will 
study the consequences of assuming that there exists a symmetry group connected 
to the state of a set of experiments. 



6 Symmetry and permissibility. 

We assume that each sample space $\mathcal{X}_a$ is a locally compact topological space, and
let $\mathcal{X} = \{(a, x) : a \in \mathcal{A}, x \in \mathcal{X}_a\}$ be the collection of all sample spaces. If $\mathcal{X}$ is given
the topology composed of unions of sets of the form $(a, V_a)$, where $V_a$ is open in $\mathcal{X}_a$,
then $\mathcal{X}$ will also be locally compact.

Now let G be a group of transformations on X . For single experiments it is 
known (Helland, 1998, and references there) that - if simple consistency properties 
are satisfied - such a group may both simplify the statistical analysis, and allow a 
considerable strengthening of conclusions. We will here discuss consequences for sets 
of potential experiments of the assumption of the existence of a symmetry group. 

The groups connected to single experiments may be looked upon as subgroups
of $G$: Let $G_a = \{g \in G : ga = a\}$. Then, by a slight abuse of notation, $G_a$ will be
considered to be a group of transformations of $\mathcal{X}_a$.

Now introduce models, i.e., look at the whole specification of the experiment
$\mathcal{E}_a = (\mathcal{X}_a, \mathcal{F}_a, \{P^{(a)}_\theta; \theta \in \Theta_a\})$. As is common in statistical models under symmetry,
it will be assumed that $\mathcal{F}_a$ is closed under the action of the elements of $G_a$, and
that $G_a$ is given a topology such that the mappings $g \to g^{-1}$, $(g_1, g_2) \to g_1 g_2$ and
$(g, x) \to gx$ are continuous.

A group $\bar{G}_a$ of transformations of the parameter space $\Theta_a$ is introduced in the
natural way by

$P^{(a)}_{\bar{g}\theta}(E) = P^{(a)}_{\theta}(g^{-1}E)$ for $E \in \mathcal{F}_a$.

A basic assumption is that the model is closed under the transformations $\bar{g}_a \in \bar{G}_a$.
Another assumption is that in each model the parameter can be identified:

$P^{(a)}_{\theta'}(E) = P^{(a)}_{\theta}(E)$ for all $E \in \mathcal{F}_a$ implies $\theta' = \theta$. (3)

As earlier it is assumed that all parameters $\theta = \theta_a$ in $\Theta_a$ are functions of a single
parameter $\phi$ from some space $\Phi$. This is the parameter characterizing the state of the
system, and it is natural to assume that all parameter transformations are generated
by transformations of $\phi$.

Assumption 3. 

There is a group $G$ of transformations on $\Phi$ such that the elements $\bar{g}_a$ of $\bar{G}_a$ all
have the form

$\bar{g}_a(\theta_a(\phi)) = \theta_a(g\phi)$. (4)

The special parametric functions $\theta(\cdot) = \theta_a(\cdot)$ that satisfy a relation of the form
(4) for some groups $G$ and $\bar{G}_a$ played an important role in Helland (1998), where
they were called invariantly estimable functions. They will be even more important
here, and they will be referred to several times, also when no estimation is involved,
so we will simply call these parametric functions permissible. The following results
summarize some of their main properties:



Lemma 2. 

(a) For the group $G$, the parametric function $\theta_a(\cdot)$ is permissible, i.e., there is a
set of transformations $\{\bar{g}_a\}$ such that (4) holds, if and only if for each pair $(\phi', \phi)$

$\theta_a(\phi') = \theta_a(\phi)$ implies $\theta_a(g\phi') = \theta_a(g\phi)$ for all $g \in G$. (5)

(b) The set of transformations $\bar{g}_a$ described by the relation (4) will necessarily
constitute a group, and this group is the homomorphic image of the group $G$: If
$g \to \bar{g}$ and $g' \to \bar{g}'$, then $gg' \to \bar{g}\bar{g}'$ and $g^{-1} \to \bar{g}^{-1}$.

(c) Assume that $\theta_a(\cdot)$ is permissible relative to $G$ as above, and assume that in
experiment $\mathcal{E}_a$ we have that $\eta(\cdot)$ is a permissible function of $\theta_a$ relative to $\bar{G}_a$. Then
$\zeta(\cdot)$ defined by $\zeta(\phi) = \eta(\theta_a(\phi))$ is permissible.



Proof. 

(a) The general implication (5) is equivalent to the requirement that $\theta_a(g\phi)$ is a
function of $\theta_a(\phi)$.

(b) Straightforward verification.

(c) We have $\zeta(g\phi) = \eta(\bar{g}_a(\theta_a(\phi))) = \bar{\bar{g}}_a(\eta(\theta_a(\phi))) = \bar{\bar{g}}_a(\zeta(\phi))$ for some $\bar{\bar{g}}_a$.
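
Condition (5) can be checked mechanically when the hyperparameter space is finite. The following is a minimal sketch under assumptions that are not in the text: the space $\Phi = \{0, \ldots, 5\}$, the cyclic shift group acting on it, and the two candidate parametric functions are hypothetical toy choices, made only to show one permissible and one non-permissible function.

```python
from itertools import product

def is_permissible(theta, phis, group):
    """Condition (5): theta(p1) == theta(p2) must imply
    theta(g(p1)) == theta(g(p2)) for every group element g."""
    for p1, p2 in product(phis, repeat=2):
        if theta(p1) == theta(p2):
            for g in group:
                if theta(g(p1)) != theta(g(p2)):
                    return False
    return True

# Hypothetical toy setting: Phi = {0,...,5}, G = cyclic shifts of Phi.
PHI = list(range(6))

def make_shift(k):
    return lambda p: (p + k) % 6

SHIFTS = [make_shift(k) for k in range(6)]

def theta_mod3(p):
    # Shifting phi shifts phi mod 3, so this function is permissible.
    return p % 3

def theta_is_zero(p):
    # Indicator of a single point; the shifts do not respect its level sets.
    return int(p == 0)

print(is_permissible(theta_mod3, PHI, SHIFTS))
print(is_permissible(theta_is_zero, PHI, SHIFTS))
```

For the indicator function, the points 1 and 2 have the same value, but after a shift by 5 they are mapped to 0 and 1, which have different values, so (5) fails.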



Certain inconsistencies in Bayesian estimation theory are avoided if one concen- 
trates on permissible parametric functions. Under weak additional assumptions on 
the loss function we also have that the best invariant estimator will be equal to the 
Bayes estimator under noninformative prior for such parameters. (Helland, 1998) 

Another homomorphism in the structure described above is $g_a \to \bar{g}_a$ ($G_a \to
\bar{G}_a$). It follows that $\bar{G}_a$ is the homomorphic image both of $G$ and of $G_a$, which is a
subgroup of $G$.

The following properties of permissible parametric functions are borrowed from 
Helland (1998), but all points are easy to verify: 

(i) The full parametric function $\phi \to \phi$ is permissible.

(ii) If $\eta$ is invariant in the sense that $\eta(g\phi) = \eta(\phi)$ for all $g \in G$ and all $\phi \in \Phi$,
then $\eta$ is permissible. If $\theta = \theta_a(\cdot)$ corresponding to an experiment $\mathcal{E}_a$ is of this type,
then $\theta$ is constant (on orbits of $G$).

(iii) If $\eta(\phi)$ is permissible with range $M$ and $\gamma$ is a 1-1 function from $M$ onto
another space $N$, then $\zeta$ given by $\zeta(\phi) = \gamma(\eta(\phi))$ is permissible. This is a special
case of Lemma 2(c).

(iv) If $\{\eta_i; i \in I\}$ is any set of permissible parametric functions, then $\theta$ given by
$\theta(\phi) = (\eta_i(\phi); i \in I)$ is permissible.

Note that the functions $\phi \to P_{\theta(\phi)}(E)$ are not necessarily permissible even though
the functions $\theta$ are. This implication does hold, however, if the parameter $\theta(\cdot)$ is
one-dimensional and $P$ depends monotonically on it, since then the relation between
$\theta$ and $P$ will be 1-1.

7 Group representation related to the parameter space.



One very important aspect of locally compact groups of transformations is that one
can define left and right invariant measures (Nachbin, 1965): $\mu(gD) = \mu(D)$ and
$\nu(Dg) = \nu(D)$, where $g \in G$ and $D \subseteq G$. If $\Phi$ is also locally compact, and if the
action of $G$ on $\Phi$ satisfies a weak extra condition, then right Haar measure can also
be defined on $\Phi$ in a consistent way (Helland, 1998, and references there). It is
argued in Helland (1998) that from many points of view this right invariant measure
is the correct one to use as a prior 'distribution' when $G$ expresses the symmetry of
the problem and no other information is present.

Linear representation of groups has played an increasing role in quantum me- 
chanical calculations in the last decades, and in fact much of the motivation behind 
recent development of group representation theory as a mathematical discipline has 
been taken from quantum theory. Nevertheless, when it comes to the physical and 
mathematical foundation of quantum theory, little use has been made of group rep- 
resentations. An exception is Bohr and Ulfbeck (1995), where physical aspects are 
emphasized. 

We will concentrate here on the group $G$ of transformations on the basic parameter
space $\Phi$. Assume that $\Phi$ is endowed with a $\sigma$-algebra with a separability
property (Dunford and Schwartz, 1958, p. 169) so that the space $\mathcal{H} = L^2(\Phi, \nu)$ of
complex square integrable functions on $\Phi$ is separable. We take the measure $\nu$ as
right Haar measure.

The elements $g \in G$ generate unitary transformations on $\mathcal{H}$ by

$U(g)f(\phi) = f(g^{-1}\phi)$. (6)

Note that these transformations form a group which is the homomorphic image of
the group $G$. A major issue in group representation theory is to study invariant
subspaces under such transformations, in particular to look for irreducible invariant
subspaces.

Now recall the permissible parametric functions $\theta(\cdot)$ defined on the same space
$\Phi$. There is a natural ordering of these functions: Say that $\theta'(\cdot) \prec \theta(\cdot)$ if $\theta'(\cdot)$
is a function of $\theta(\cdot)$, i.e., $\theta'(\phi) = \psi(\theta(\phi))$ for some $\psi$. In classes of statistical
models/experiments it may be of some interest to find minima under this ordering
for certain sets of permissible functions. We will see below that equivalence classes
of permissible functions are in one-to-one correspondence with subspaces of $\mathcal{H}$ that are
invariant under the group-generated class of unitary transformations, and that the
ordering above corresponds to the natural ordering of subspaces. This may explain
to a certain extent why group representation theory is so important in quantum
mechanics.


Theorem 2.

(a) Consider a fixed permissible function $\theta(\cdot)$. The set of functions $f$ of the form
$f(\phi) = f_0(\theta(\phi))$ constitutes a closed linear subspace of $\mathcal{H}$ which is invariant under
the transformations $U(g)$.

(b) If $\theta_1$ is a permissible function and $\theta_1 \prec \theta$, then the subspace corresponding to
$\theta_1$ is contained in the subspace corresponding to $\theta$. Conversely, if $V_1$ corresponds to
$\theta_1$, $V$ to $\theta$, and $V_1$ is a subspace of $V$, then $\theta_1(\cdot) \prec \theta(\cdot)$.

(c) Say that two permissible parametric functions $\theta(\cdot)$ and $\theta'(\cdot)$ are equivalent if
they are 1-1 functions of each other. Then the set of equivalence classes of permissible
functions is in 1-1 correspondence with the subset of invariant subspaces of $\mathcal{H}$
described in (a).

Proof. 

(a) It is clear that the space is closed under linear combinations, and also under
infinite sums that converge in $L^2$-norm, so the space is closed. If $f$ belongs to the
subspace, then

$U(g)f(\phi) = f(g^{-1}\phi) = f_0(\bar{g}^{-1}(\theta(\phi)))$

is also in the subspace.

(b) Obvious.

(c) From (b) the invariant subspaces of equivalent permissible functions must be
contained in each other, and hence coincide.

For the use of group representation in physics, see for instance Hamermesh (1962);
a reference to the more general theory is Barut and Rączka (1977).
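
The invariance claim of Theorem 2(a) can be illustrated on a finite example. The sketch below is not taken from the text: it assumes the hypothetical toy setting $\Phi = \{0, \ldots, 5\}$ with cyclic shifts as the group, counting measure as Haar measure, and $\theta(\phi) = \phi \bmod 3$ as the permissible function. It applies the representation (6) to a function of the form $f_0(\theta(\phi))$ and checks that the result is again a function of $\theta$ with the same norm.

```python
import math

N = 6  # hypothetical finite hyperparameter space Phi = {0, ..., N-1}

def theta(p):
    """Toy permissible function on Phi (an assumption of this sketch)."""
    return p % 3

def U(k, f):
    """Representation (6): (U(g) f)(phi) = f(g^{-1} phi), g = shift by k."""
    return [f[(p - k) % N] for p in range(N)]

def is_function_of_theta(f):
    """True if f(phi) depends on phi only through theta(phi)."""
    seen = {}
    for p in range(N):
        if seen.setdefault(theta(p), f[p]) != f[p]:
            return False
    return True

def sq_norm(f):
    """Squared L2 norm with respect to counting measure."""
    return sum(abs(v) ** 2 for v in f)

f0 = {0: 1.0, 1: -2.0, 2: 0.5}
f = [f0[theta(p)] for p in range(N)]           # element of the subspace
indicator = [1.0 if p == 0 else 0.0 for p in range(N)]  # not in the subspace

stays_in_subspace = all(is_function_of_theta(U(k, f)) for k in range(N))
norm_preserved = all(math.isclose(sq_norm(U(k, f)), sq_norm(f)) for k in range(N))
print(stays_in_subspace, norm_preserved, is_function_of_theta(indicator))
```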

In later developments we will need further properties of the Hilbert space $\mathcal{H}$ and
those closed subspaces $V$ of $\mathcal{H}$ that are defined by permissible parametric functions.
Here is the kind of result that is needed.



Lemma 3. 

(a) Under weak regularity conditions the parameter group $\bar{G}_a$ of experiment $\mathcal{E}_a$
will be locally compact and have a right Haar measure $\nu_a$. Let $f(\cdot)$ be a given function
on $\Phi$. Then there is a unique (almost everywhere with respect to $\nu_a$) function $f_a(\theta)$
such that for all functions $c(\cdot)$ of $\theta$ we have

$\int c(\theta_a(\phi)) f(\phi)\, \nu(d\phi) = \int c(\theta) f_a(\theta)\, \nu_a(d\theta)$. (7)

(b) Let $V_1 \subset V_a \subset \mathcal{H}$, where $V_1$ corresponds to the permissible function $\theta_1(\cdot)$ and
$V_a$ to $\theta_a(\cdot)$. Let $f_1$ and $f_a$ be the corresponding functions defined in (a). Then, for
all $c_1(\cdot)$

$\int c_1(\theta_1) f_1(\theta_1)\, \nu_1(d\theta_1) = \int c_1(\theta_1(\theta)) f_a(\theta)\, \nu_a(d\theta)$. (8)

(c) For all functions $g$ on $\Theta_a$ we have

$\int |f(\phi) - f_a(\theta_a(\phi))|^2\, \nu(d\phi) \le \int |f(\phi) - g(\theta_a(\phi))|^2\, \nu(d\phi)$. (9)

Equality holds here if and only if $g(\theta) = f_a(\theta)$ almost everywhere with respect to the
measure $\nu_a$.



Proof. 

(a) We know that $G$ is locally compact, and that $\bar{G}_a$ is the homomorphic image of
$G$. Assuming that the function describing the homomorphism is continuous, $\bar{G}_a$ will
inherit the topology from $G$, and then be locally compact. The measure $f(\phi)\nu(d\phi)$
will be absolutely continuous with respect to $\nu_a(d\theta)$; let $f_a(\theta)$ be the Radon-Nikodym
derivative.

(b) This follows from well-known properties of Radon-Nikodym derivatives.

(c) By using (7) on $c(\theta) = g(\theta) - f_a(\theta)$, we find

$\int |f(\phi) - g(\theta_a(\phi))|^2\, \nu(d\phi) - \int |f(\phi) - f_a(\theta_a(\phi))|^2\, \nu(d\phi) = \int |g(\theta) - f_a(\theta)|^2\, \nu_a(d\theta)$.

Equations (7) and (9) show that the function $f_a(\theta(\phi))$ may be regarded as the
projection of $f(\phi)$ on the space determined by the permissible function $\theta(\cdot)$, and (8)
shows that this projection functions as it should under iteration. This appears to
be the start of a state vector approach to quantum mechanics based on a Hilbert
space representation of $G$. We will come back to this approach later, but first we
will return to the lattice approach, which leads to a (complementary) Hilbert space
representation of propositions.
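
The projection property in Lemma 3 can be checked numerically in a finite setting. This is a sketch under assumptions not made in the text: $\Phi = \{0, \ldots, 5\}$ with counting measure as $\nu$, $\theta_a(\phi) = \phi \bmod 3$, and $\nu_a$ taken as the image of $\nu$ under $\theta_a$ (one particular normalization of Haar measure on the three-point parameter space). With these choices $f_a$ is the average of $f$ over each level set of $\theta_a$, and the code verifies (7), the inequality (9) and the identity used in the proof of (c).

```python
N = 6
THETAS = [0, 1, 2]

def theta(p):
    """Toy permissible function (an assumption of this sketch)."""
    return p % 3

nu = {p: 1.0 for p in range(N)}  # counting measure on Phi
# Image measure of nu under theta, used here as nu_a.
nu_a = {t: sum(nu[p] for p in range(N) if theta(p) == t) for t in THETAS}

def project(f):
    """f_a(t): nu-weighted average of f over {phi : theta(phi) = t}."""
    return {t: sum(f[p] * nu[p] for p in range(N) if theta(p) == t) / nu_a[t]
            for t in THETAS}

def sq_dist(f, g):
    """Integral of |f(phi) - g(theta(phi))|^2 with respect to nu."""
    return sum((f[p] - g[theta(p)]) ** 2 * nu[p] for p in range(N))

f = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
f_a = project(f)

def c(t):
    """An arbitrary test function of theta."""
    return t ** 2 + 1

lhs_7 = sum(c(theta(p)) * f[p] * nu[p] for p in range(N))
rhs_7 = sum(c(t) * f_a[t] * nu_a[t] for t in THETAS)

g = {0: 0.0, 1: 0.0, 2: 0.0}  # a competing function of theta
gap = sq_dist(f, g) - sq_dist(f, f_a)
pythagoras = sum((g[t] - f_a[t]) ** 2 * nu_a[t] for t in THETAS)
print(f_a, lhs_7, rhs_7, gap, pythagoras)
```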



8 Lattice property and Hilbert space. 

From Assumption 3 it is clear that the parameters $\theta$ connected to each single
experiment are permissible functions of the basic (hyper-)parameter $\phi$. An important
observation was made in Lemma 2(c): Further parametric functions that are
permissible relative to this single experiment are also permissible relative to $\phi$. Often
it is natural to design sub-experiments to estimate such parameters.

The first part of the assumption below can be motivated as follows: If $a(\cdot)$ is a
function of a set of parameters, and $b(\cdot)$ is an extension of $a$ using more parameters,
then usually the values $b$ will be smaller than the values $a$ for some parameter values,
but in most cases there are also parameter values for which $b$ is larger than $a$.

We let the vector space representation of permissible parametric functions from 
the previous section be understood. 

Assumption 4. 

(a) The supremum of a set of propositions $\{\mathcal{P}_i\}$, if it exists, will be some
proposition $\mathcal{P} = (a, E)$ with probability $P^{(a)}_{\theta(\phi)}(E)$ where the parameter $\theta$ determines a space
(Theorem 2) contained in the space spanned by those corresponding to the parameters
$\theta_i$ in each $\mathcal{P}_i$.

(b) Corresponding to every such parametric function $\theta$ there is an experiment $\mathcal{E}$.



Our main aim in this section is to show that the supremum of any set of propo- 
sitions can be defined. 

Theorem 3. 

Let $\{\mathcal{P}_i = (a_i, E_i); i \in I\}$ be a set of propositions, partially ordered under the
ordering given in Definition 1. Let Assumption 3 and Assumption 4 hold. Then the
supremum $\mathcal{P}$ of this set exists, in the sense that $\mathcal{P}_i \le \mathcal{P}$ for all $i$ and $\mathcal{P}_i \le \mathcal{P}_0$ for all $i$
implies $\mathcal{P} \le \mathcal{P}_0$.

Proof. 

Assume a set of propositions $\mathcal{P}_i = (a_i, E_i)$ with probabilities $P^{(a_i)}_{\theta_i(\phi)}(E_i)$. What
we are after is a proposition $\mathcal{P} = (a, E)$ such that

$P^{(a_i)}_{\theta_i(\phi)}(E_i) \le P^{(a)}_{\theta(\phi)}(E)$ (10)

for all $i$ and for all $\phi$, and such that $\mathcal{P}$ is the minimal proposition satisfying this.
By Assumption 4(a) we can restrict our search to propositions $\mathcal{P}$ with parametric
function $\theta$ associated with a definite linear space $V$, the one spanned by the linear
spaces of the parameters $\theta_i$.

Now (10) for each $\phi$ is equivalent to

$\sup_i P^{(a_i)}_{\psi_i(\theta(\phi))}(E_i) \le P^{(a)}_{\theta(\phi)}(E)$, (11)

where $\psi_i(\theta(\phi)) = \theta_i(\phi)$, which makes sense by the nesting of the linear spaces and
by Theorem 2(b).

By Assumption 4(b) there exists at least one experiment with parameter $\theta$. Pick
one such experiment. Then we can choose the $E$ which minimizes the righthand side
of (11) under this constraint for all $\phi$. Propositions that give the same solution
are identified by earlier assumptions. It is then clear that the supremum found is
unique.

As a consequence of Theorem 1 and Theorem 3 we now have that, with the
assumptions made, the set of propositions, constructed in a natural way from sets
of potential experiments subject to symmetry conditions, forms an orthomodular,
orthocomplemented lattice. This is the basic entity of the quantum logic approach to
quantum mechanics. We feel that most of the assumptions made are relatively
innocent; they may perhaps be improved slightly in detail, but they have a more concrete
interpretation than axioms for quantum mechanics in most existing approaches.

The remaining conditions from Section 3, atomicity, covering property and sep- 
arability are more technical, and some of them have been controversial in some of 
the quantum logic literature. From a statistical point of view it is obvious that each 
sample space is the union of its atoms, but the corresponding assumption is more 
problematic if events with the same probability for each 9 are identified, as we have 
chosen to do here. And then we also have difficulties with the covering properties 
in models where atoms are undefined. On the other hand, with the identification
of events, it is usually not problematical to assume that unions of disjoint events in 
each sample space are at most countable, and then by Assumption 2, the separability 
property follows. 

A simple solution is to assume that all experiments are discrete, that is, that each 
sample space Xa is countable. Then all conditions are satisfied, and by arguments in 
Piron (1976), Maeda and Maeda (1970), Varadarajan (1985) and other places, the 
ordinary Hilbert space model of quantum mechanics follows. 

Heuristically, continuous sample spaces may be approximated by discrete sample 
spaces. In fact, the situation at this point may be seen as a reflection of the situa- 
tion in ordinary quantum mechanics, where it is well known that precise treatment 
of continuous variables requires other concepts than the ordinary Hilbert space ap- 
proach, say based on C* algebras. Note that our own underlying framework with 
sets of models for experiments subject to symmetry is conceptually simple in the 
continuous case, too. 

From the discussion in Beltrametti and Cassinelli (1981) we deduce:



Theorem 4. 

In the case where all experiments $\mathcal{E}_a$ are discrete, and the assumptions above hold,
there is a complex, separable Hilbert space $\mathcal{H}_0$ such that (assuming that the dimension
of $\mathcal{H}_0$ is $\ge 3$) each proposition $\mathcal{P} = (a, E)$ can be associated uniquely with a projection
operator $\Pi_{a,E}$ in $\mathcal{H}_0$ in the sense that

$P^{(a)}_{\theta_a(\phi)}(E) = \mathrm{trace}(\rho \Pi_{a,E})$, (12)

where $\rho = \rho(\phi)$ is a density operator, a non-negative operator of trace 1.



Here we have combined the Theorem on existence of the Hilbert space with the 
famous Gleason's Theorem, which states that if the dimension of the Hilbert space 
is 3 or larger, every probability distribution over propositions can be computed in 
this way. Gleason's theorem is very cumbersome to prove (Varadarajan, 1985). We 
will indicate later that the state vector representation of Section 7 probably may be 
used to give a simpler proof of this result. 
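
The content of formula (12) can be made concrete with a small numerical sketch. Everything specific here is an assumption for illustration only: a three-dimensional space, two orthonormal bases standing in for two discrete experiments $a$ and $b$, and a density operator built as an equal mixture of two pure states. The point shown is that one and the same $\rho$ yields a probability distribution for each experiment.

```python
import math

def outer(u, v):
    """Matrix u v^dagger for vectors u and v."""
    return [[ui * complex(vj).conjugate() for vj in v] for ui in u]

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def probabilities(rho, basis):
    """trace(rho Pi_x) with Pi_x = e_x e_x^dagger for each basis vector e_x."""
    return [trace(matmul(rho, outer(e, e))).real for e in basis]

s = 1 / math.sqrt(2)
basis_a = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]    # hypothetical experiment a
basis_b = [(s, s, 0), (s, -s, 0), (0, 0, 1)]   # hypothetical experiment b

# Density operator: equal mixture of the pure states (1,0,0) and (s,s,0).
pure_1 = outer((1, 0, 0), (1, 0, 0))
pure_2 = outer((s, s, 0), (s, s, 0))
rho = [[0.5 * p + 0.5 * q for p, q in zip(r1, r2)]
       for r1, r2 in zip(pure_1, pure_2)]

probs_a = probabilities(rho, basis_a)
probs_b = probabilities(rho, basis_b)
print(probs_a, sum(probs_a))
print(probs_b, sum(probs_b))
```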



Corollary 1. 

The conclusion of Theorem 4 holds under only the Assumptions 1, 3 and 4.



Proof. 

As it stands, Theorem 4 is deduced using all of Assumptions 1, 2, 3 and 4. As
remarked earlier, however, it is always possible to extend the set of propositions so
that Assumption 2 holds. This extended set of propositions does not occur in the
statement (12).



All that has been done up to now could also have been done for the simple case
of a distributive lattice. Then as in Theorem 4 we would still have been given a
Hilbert space, but this Hilbert space would have been trivial. Hence, a further question to
ask is whether the quantum logic derived from sets of experiments is non-distributive in
general. In some sense this is obvious, since if it were distributive, we would have
an ordinary Boolean algebra, which by a well-known theorem by Stone would imply
that everything could be represented as one single experiment. Here is a simple
example of non-distributivity. More complicated examples can be constructed from
this.

Example 1. 

Look at two experiments $\mathcal{E}_a = (\mathcal{X}, \mathcal{F}, \{P_\theta\})$ and $\mathcal{E}_b = (\mathcal{Y}, \mathcal{G}, \{Q_\psi\})$. Let $A, B \in \mathcal{F}$,
$A, B \ne \mathcal{X}$, but $A \cup B = \mathcal{X}$, and let $C \in \mathcal{G}$, $C \ne \emptyset$. Assume that the two
experiments are unrelated in the sense that $\theta$ and $\psi$ depend on the common
parameter $\phi$ in disjoint parts of its range $\Phi$. Also assume that $P_\theta(A) = P_\theta(B) =
Q_\psi(C) = 0$ in the areas where these are independent of the parameters. Then
$(a, A^c) \vee (b, C^c)$ is some proposition $(e, D)$ whose probability for all $\phi$ should
dominate $\max(P_{\theta(\phi)}(A^c), Q_{\psi(\phi)}(C^c))$. But, from the assumptions made, this maximum
must be 1 for all $\phi$. Hence $(a, A^c) \vee (b, C^c) = 1$, so $(a, A) \wedge (b, C) = 0$. Similarly,
$(a, B) \wedge (b, C) = 0$, so $((a, A) \wedge (b, C)) \vee ((a, B) \wedge (b, C)) = 0$. On the other hand,
$((a, A) \vee (a, B)) \wedge (b, C) = (a, A \cup B) \wedge (b, C) = 1 \wedge (b, C) = (b, C)$. So these three
events do not satisfy the distributive law.



Finally, one can ask if the quantum logic derived from sets of potential experiments
is wide enough to cover everything that is of interest in quantum mechanics. Of
course, it is very ambitious to try to give an answer to such a question, but it is an
encouraging thought that virtually every attempt that can be made to verify any
theory on the quantum level has to be based on experiments. Hence it seems very
difficult to go beyond this frame. (However, this argument does perhaps not
hold for quantum cosmology.)



9 Observables. 

In Section 7 we introduced a group representation of the transformation group G 
in the parameter space, and looked at some concrete interpretations of that rep- 
resentation in Theorem 2. The Hilbert space of Theorem 4 can in some sense be 
regarded as the corresponding representation of the sample space group $G$. Even
though the consequences of this latter representation are discussed in all books on
quantum mechanics, we will make some remarks on it here.

Since we have assumed that all sample spaces $\mathcal{X}_a$ are discrete (this is not necessary
for our basic model, but convenient for the Hilbert space representation), we might
as well assume that $\mathcal{X}_a = \{1, 2, \ldots\}$ for each $a$. The $\sigma$-algebra is then the obvious one,
and there is a parametric class of probabilities $\{P^{(a)}_\theta(x); \theta \in \Theta_a\}$, adding to 1 for each
$a$. As elsewhere it is assumed that the parameter in each experiment is a function
of a state parameter $\phi \in \Phi$. The most important assumptions that we have made,
apart from the requirement that the class of experiments should be rich enough, are that pairwise
orthogonality should imply total orthogonality in the sense of (1) and that there
exists a group structure on $\Phi$ with the consistency requirement $\theta_a(g\phi) = \bar{g}_a(\theta_a(\phi))$
for some $\bar{g}_a$. We will also assume here that $G$ is transitive on $\Phi$; the case with several
orbits corresponds to superselection rules: Index the orbits with some parameter
$r$. Then $r$ is conserved during all symmetry transformations. Using a reasonable
theory of time development, it can also be shown to be conserved over time. Physical
variables with this property might be charge, mass or hypercharge. In the further
discussion there is nothing lost by keeping $r$ fixed, which is the same as sticking to
one particular orbit, so that $G$ is transitive.

The simplest quantum-mechanical interpretation of (12) is now that each primitive
event $(a, x)$ can be represented in a fixed complex separable Hilbert space $\mathcal{H}_0$ by a
one-dimensional projection: For each $a$ we may take a set of orthonormal vectors
$\{e_{a,x}\}$, so that the event $(a, x)$ is represented by the projection $e_{a,x} e_{a,x}^\dagger$. In concrete
terms this means that according to Gleason's theorem there exists a density operator
$\rho$ such that $P^{(a)}_{\theta_a(\phi)}(x) = e_{a,x}^\dagger \rho\, e_{a,x}$. It is not difficult to see that it is always possible to
find one such $\rho$ for each experiment; the strong part of this result is that the same
$\rho$ can be chosen for all the experiments.

In each experiment one can of course introduce random variables in the usual
statistical sense: $Y(\cdot), Z(\cdot), \ldots$ are measurable functions on some $\mathcal{X}_a$, and the
distributions of these are determined in the usual way from the probabilities above. Again, the
main thing that is new in quantum theory is that there is some connection between
variables defined in different experiments, and this connection may have rather large
consequences. It may be natural, since probabilities are summarized through
projection operators, to associate random variables originally defined in an experiment
$\mathcal{E}_a$ with operators, too: If $Y(x) = y_x$ for $x \in \mathcal{X}_a$, then take

$Y = \sum_x y_x e_{a,x} e_{a,x}^\dagger$. (13)

The implication from Gleason's theorem is then that in any state $\rho$ the
expectation of $Y$ will be

$\langle Y \rangle = \mathrm{trace}(Y\rho)$. (14)

Again take $\rho = ee^\dagger$ for some unit vector $e$. Then an easy standard calculation shows
that the variance $\langle (Y - \langle Y \rangle)^2 \rangle$ of $Y$ vanishes if and only if $e$ is an eigenvector of $Y$
with some eigenvalue $\lambda$. Well-known results from quantum theory follow.
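
The statement about the variance can be checked on a two-dimensional example. The sketch below assumes, purely for illustration, an operator $Y$ of the form (13) with values $y_1 = 1$, $y_2 = -1$ on the standard basis. For a pure state $\rho = ee^\dagger$ the distribution of $Y$ is $p(x) = |e_x^\dagger e|^2$, so the expectation (14) and the variance reduce to ordinary sums.

```python
import math

def distribution(basis, e):
    """p(x) = |<e_x, e>|^2 for the pure state rho = e e^dagger."""
    return [abs(sum(complex(b).conjugate() * c for b, c in zip(ex, e))) ** 2
            for ex in basis]

def mean_and_variance(values, probs):
    """Expectation trace(Y rho) and variance of Y = sum_x y_x e_x e_x^dagger."""
    mean = sum(y * p for y, p in zip(values, probs))
    var = sum((y - mean) ** 2 * p for y, p in zip(values, probs))
    return mean, var

basis = [(1, 0), (0, 1)]   # orthonormal vectors e_x (hypothetical experiment)
values = [1.0, -1.0]       # values y_x of the random variable

eigenvector = (1, 0)                                  # eigenvector of Y
superposition = (1 / math.sqrt(2), 1 / math.sqrt(2))  # not an eigenvector

mean_eig, var_eig = mean_and_variance(values, distribution(basis, eigenvector))
mean_sup, var_sup = mean_and_variance(values, distribution(basis, superposition))
print(mean_eig, var_eig)
print(mean_sup, var_sup)
```

The variance is zero in the eigenvector state and strictly positive in the superposition, in line with the claim above.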

With a more general sample space, the random variable $Y$ is a measurable function
from $(\mathcal{X}_a, \mathcal{F}_a)$ to the real line with its Borel sets $\mathcal{B}$, meaning that $E_a = Y^{-1}(B)$
belongs to $\mathcal{F}_a$ whenever $B \in \mathcal{B}$. In quantum logic, an observable is defined more generally
as a measurable map from $\mathcal{B}$ to the set of propositions $\{(a, E_a) : a \in \mathcal{A}, E_a \in \mathcal{F}_a\}$.
This is a formal way of generalizing the notion of random variables in such a way 
that it makes sense for several experiments, a conceptual idea that may be of interest 
in ordinary statistics, too. Some technical problems associated with this notion of 
observables are discussed in Gudder (1978). 



10 States.

It is now time to leave the habit of only associating the concept of state with fixed
values of the hyperparameter $\phi$. Two fundamental observations lead to this:

1) So far, nothing has been said about the state space $\Phi$ except that there should
be defined a group $G$ on it. From that point of view, there is nothing to prevent us
from replacing $\Phi$ by a larger space $\Phi'$, constructed such that each $\phi$ is a function on
$\Phi'$; in the earlier notation: $\phi \prec \phi'$. Then little in what has been said is changed if
each $\phi$ is replaced by any parameter value $\phi'$ that is mapped upon $\phi$, or more generally,
by a prior measure on these $\phi'$.

2) Since the density operator $\rho$ in (12) is a weighted average of pure state density
operators, it is reasonable to have the lefthand side also as a measure over 'pure
states' in some sense expressed by $\phi$. It is this argument that we now will try to
make more precise.

Again, since we in the derivation of (12) have assumed that each $\mathcal{X}_a$ is discrete,
we might as well replace $E$ by a singleton $\{x\}$. In Section 9 we used $\Pi_{a,x} = e_{a,x} e_{a,x}^\dagger$,
but this is not needed here. We will assume that $x$ is nontrivial in the sense that
$P^{(a)}_{\theta_a(\phi)}(x) > 0$ for at least one $\phi$.

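If we now replace $\phi$ by an average according to some prior probability measure
$\tau_\rho(\cdot)$ on $\Phi$, the equation reads

$\int P^{(a)}_{\theta_a(\phi)}(x)\, \tau_\rho(d\phi) = \mathrm{trace}(\rho \Pi_{a,x})$. (15)

Look first at the case where $\rho = e_i e_i^\dagger$ is a pure state. Then (15) is

$\int P^{(a)}_{\theta_a(\phi)}(x)\, \tau_i(d\phi) = e_i^\dagger \Pi_{a,x} e_i$. (16)

Since it is clear from (15) and (16) that $\tau_\rho(\cdot) = \int \lambda(di)\, \tau_i(\cdot)$ when $\rho = \int \lambda(di)\, e_i e_i^\dagger$, we
may concentrate on (16).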

Now turn to the arbitrariness in the choice of $\Phi$ as remarked in 1) above. A
natural requirement is that $\Phi$ should be chosen in some minimal way, subject to the
requirement that it should serve as a hyperparameter space for all the experiments
in question. In fact the concept of minimality can be made completely precise, since we
have the ordering $\prec$ of parametric functions.

We will for simplicity assume that each $\tau_i$ is absolutely continuous with respect
to the right $G$-Haar measure $\nu(\cdot)$ on $\Phi$.



Theorem 5. 

(a) There is (under technical assumptions) a unique minimal hyperparametric
space $\Phi$ so that (16) holds for each $a$, $x$ and $i$.

(b) For each $i$ such that (16) is positive for at least one $(a, x)$ there is a unique
hyperparameter $\phi_i$ such that the measure $\tau_i$ is concentrated on $\phi_i$. The correspondence
between the pure state vectors $e_i$ and these hyperparameters $\phi_i$ is given by

$P^{(a)}_{\theta_a(\phi_i)}(x) = e_i^\dagger \Pi_{a,x} e_i$. (17)

Proof. 

(a) Start with one $\Phi'$, and with respect to this hyperparameter space consider
the Hilbert space $\mathcal{H}' = L^2(\Phi', \nu')$ as discussed in Section 7. When $\tau'_i(\cdot)$ is absolutely
continuous with respect to Haar measure, the lefthand side of (16) can be written

$\int P^{(a)}_{\theta'_a(\phi')}(x) f'_i(\phi')\, \nu'(d\phi')$ (18)

for some function $f'_i$. Using the projection defined in Lemma 3, each such expression
can be projected onto a space $V'_a$ corresponding to the parametric function
$\theta'_a$. Take $\mathcal{H}$ as the minimal linear space spanned by $\{V'_a : a \in \mathcal{A}\}$. We will assume
regularity conditions (cf. Theorem 2(d)) such that $\mathcal{H}$ corresponds to a unique
(hyper-)parametric function $\phi(\phi')$. This means, again using Lemma 3, that all dashes can
be removed from the expression (18). When $\tau'_i(\cdot)$ is not absolutely continuous, a
limiting argument must be used.

(b) Look at the lefthand side of equation (16) when $\Phi$ is minimal, and fix $i$.
Assume that $\tau_i(\cdot)$ can not be chosen as Dirac measure. Then, by projecting upon
all the spaces corresponding to the $\theta_a(\cdot)$, there is a nontrivial measure left which does
not concern any of the parameters, contradicting the assumption that $\Phi$ is minimal.
Thus $\tau_i(\cdot)$ has to be Dirac on some value $\phi_i$. This means that (17) holds.






The result of Theorem 5 establishes a connection between the two Hilbert spaces
of this paper, and is in some sense close to giving an independent proof of Gleason's 
theorem. And it also gives a link to ordinary statistical models. 

Suppose that an ideal measurement has been done. This means that some
experiment $\mathcal{E}_a$ has been performed, and this has been so accurate that we afterwards can
assume that the corresponding parameter is exactly determined: $\theta_a(\phi) = \theta_0$. From
a statistical point of view it is clear that such a measurement must have consequences
for what is predicted in future measurements, and hence also for the state.

Assume that the state before measurement is given by a density matrix $\rho$, hence
a prior measure $\tau_\rho(\cdot)$, so that the probability of the result $x$ in experiment $\mathcal{E}_a$ is given by both sides
of (15). After the measurement is done, the new state - i.e., the new measure -
should be restricted to $\Phi_0 = \{\phi : \theta_a(\phi) = \theta_0\}$. By suitably parametrizing $\Phi_0$ (we
can find a permissible parametrization since $\theta_a$ is permissible) this again amounts
to a projection in the sense of Lemma 3. The well-known Gleason Theorem solution
emerges, namely to replace $\rho$ by $\pi\rho\pi$, normalized, where $\pi$ projects on the set of
state vectors corresponding to those $\phi$ for which $\theta_a(\phi) = \theta_0$.
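
The update rule $\rho \to \pi\rho\pi$, normalized, is easy to carry out numerically. The following sketch uses a hypothetical three-dimensional density matrix and a hypothetical projection onto the first two coordinates; neither is taken from the text. It also checks that applying the same update twice changes nothing, as one expects from a projection.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def update_state(rho, proj):
    """Replace rho by proj * rho * proj, normalized to trace 1."""
    post = matmul(matmul(proj, rho), proj)
    weight = trace(post)
    if weight == 0:
        raise ValueError("the conditioning event has probability zero")
    return [[entry / weight for entry in row] for row in post]

# Hypothetical state: symmetric, non-negative, trace 1.
rho = [[0.5, 0.25, 0.0],
       [0.25, 0.3, 0.0],
       [0.0, 0.0, 0.2]]

# Hypothetical projection onto the span of the first two basis vectors.
proj = [[1, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]

rho_new = update_state(rho, proj)
rho_twice = update_state(rho_new, proj)
print(rho_new)
```

The normalizing constant `weight` equals trace($\rho\pi$), the prior probability of the conditioning event, so the update has the same form as ordinary conditioning in probability theory.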

In some early quantum mechanical literature, such results were regarded as
somewhat mysterious, as reflected in the term 'collapse of the wave packet'. From
a statistical point of view it is well known that models must be changed when new
information is obtained. Bayesian statistics has put this way of thinking into a 
system, also for non-ideal measurements. However, the idea is well-known also in 
non-Bayesian statistics. 



11 Measurement theory: Statistical inference theory. 

The measurement theory of von Neumann (1955) has been criticized by several 
authors, mainly because of difficulties with giving a precise division between the 
microscopic system and the macroscopic measuring device. 

The ordinary statistical approach seems to be able to offer a well documented, well
understood and extremely well tested solution to the measurement problem, although
the extent to which this theory can be applied directly to physical experiments
remains to be fully debated: Suppose that some specific (statistical) experiment
$\mathcal{E}_a$ has been performed. Statistical inference theory to estimate the parameter $\theta$ of
such an experiment is far developed, see for instance Lehmann (1983). A difficulty
is that this does not usually determine the state $\phi$. This may however be solved
using an objective Bayes approach; see below. In Section 4 above the extreme set
$\mathcal{A}_0$ of experiments was defined. A reasonable conjecture is that under regularity
conditions, when an experiment belongs to $\mathcal{A}_0$, there is a 1-1 correspondence between
$\theta$ and $\phi$. This seems to be supported by observing the Hilbert space solution for this
case.

In Malley and Hornstein (1993) statistical inference specifically for the quantum 
situation is discussed. 

In the present context it is natural to make use of the symmetry properties of the 
quantum situation, also in the inference phase. This was done in a general setting 
in Helland (1998). We will only briefly recapitulate one main result adapted to the
quantum theory solution proposed here.

Assume that to start with we have no information about the system, so that
the Haar prior $\nu$ on $\Phi$ with respect to the basic group $G$ is used. Then we perform
an experiment $\mathcal{E}_a$ with parametric function $\theta(\phi)$. The experiment results in the
measurement of a random variable $X$, which is taken as a point in the sample space,
and can be multidimensional or even more general. We assume that the group $G_a$
on the sample space $\mathcal{X}_a$ of the experiment is known. For simplicity it is assumed
that this group is transitive, and also that $G$ is transitive on $\Phi$ (no superselection
rules). In general one can condition upon orbits. The probability density on the
sample space $\mathcal{X}_a$ is known as a function of $\phi$; for simplicity this is taken as a density
$p_\phi(x)$ with respect to the measure $\nu_a$ on $\mathcal{X}_a$ generated by Haar measure on $G_a$,
that is, $P^{(a)}_{\theta_a(\phi)}(E) = \int_E p_\phi(x)\, \nu_a(dx)$ for all $E$.

The main question is now how $\phi$ shall be estimated; i.e., one wants an estimator
$\hat\phi(X)$ which is as close as possible to $\phi$. To specify what we mean by closeness,
we need to specify a loss function. Assume for simplicity that $\phi$ is a vector parameter
(in some $\mathbb{R}^k$), and that quadratic loss is used. Then the solution is

$$\hat\phi(x) = \frac{\int \phi\, p_\phi(x)\,\nu(d\phi)}{\int p_\phi(x)\,\nu(d\phi)}.$$

This solution is best in two different ways: It is the best equivariant ('permissible') 
estimator, and it minimizes the Bayes loss, that is, the expected loss with respect to 
the a posteriori distribution when the prior is Haar as above.
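
As a minimal numerical sketch of this estimator, assume the simplest special case (an assumption made here for illustration, not taken from the paper): the parameter is a real location parameter, the group is the translation group of the real line, so that the Haar measure is Lebesgue measure, and the observations are independent N(phi, 1). The estimator is then the ratio of the integral of phi times the likelihood to the integral of the likelihood, which the code approximates by the trapezoidal rule. The function names and the integration range are hypothetical choices.

```python
import math


def haar_posterior_mean(xs, density, lo, hi, steps=20001):
    """Approximate the ratio (integral of phi * L(phi)) / (integral of L(phi)).

    L(phi) is the likelihood prod_i density(x_i - phi); the flat (Lebesgue)
    prior plays the role of the Haar prior for the translation group.
    The grid spacing cancels in the ratio, so it is not multiplied in.
    """
    h = (hi - lo) / (steps - 1)
    num = 0.0
    den = 0.0
    for i in range(steps):
        phi = lo + i * h
        weight = 0.5 if i in (0, steps - 1) else 1.0  # trapezoidal end weights
        like = 1.0
        for x in xs:
            like *= density(x - phi)
        num += weight * phi * like
        den += weight * like
    return num / den


def std_normal(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


sample = [0.3, 1.1, -0.4, 2.0]  # hypothetical data
estimate = haar_posterior_mean(sample, std_normal, lo=-10.0, hi=10.0)
sample_mean = sum(sample) / len(sample)
```

In this special case the ratio should agree with the sample mean up to integration error, which is the classical Pitman estimator of location.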

By consistency, the estimator of $\theta = \theta(\phi)$ must have a similar formula with
$p_\phi(x)$ replaced by $q_\theta(x)$, where $p_\phi(x) = q_{\theta(\phi)}(x)$. On the other
hand, it seems reasonable, and it is in fact optimal, to estimate any $\eta(\phi)$ by the
formula above, where $\phi$ under the integral in the numerator is replaced by $\eta(\phi)$.
This may seem to give two different formulae for $\hat\theta(x)$, but the two formulae are
equal by a straightforward use of Lemma 3. Note, however, that we do not have
$\hat\theta = \theta(\hat\phi)$; this relation is only approximately satisfied when one has
large amounts of data.

12 Time evolution and the role of Planck's constant.



So far the description has been static; we will now try briefly to take time development
into account. The point of departure is that we have defined the state of the
system as some parameter $\phi$ in some space $\Phi$. In practice this state can develop
with time in many different ways. A simple and physically plausible assumption is that
there is a continuous group $K$ acting on $\Phi$ such that $\phi$ after time $t$ changes to
$k_t\phi$, where $k_t \in K$.

In Section 7 we discussed a unitary representation of the symmetry group $G$ of
$\Phi$ on a Hilbert space $\mathcal{H}$. Assume that one can find a unitary representation
of $K$ also on $\mathcal{H}$. Since $K$ is a continuous group, a well-known theorem by Stone
implies that the representation then must be of the form

$$U(t) = e^{iAt}$$

for some selfadjoint operator $A$, perhaps after rescaling time. As is demonstrated in
several books, taking $A = -H/\hbar$ here, where $\hbar$ is Planck's constant and $H$ is the
Hamiltonian, leads to the Schrödinger equation.
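
The form of the representation can be checked numerically on a toy case. The sketch below assumes a hypothetical Hermitian 2x2 Hamiltonian and units with hbar = 1 (both are illustrative assumptions, not taken from the paper), builds U(t) = exp(iAt) with A = -H/hbar from a truncated Taylor series, and measures how well U(t) is unitary, satisfies the group law U(s)U(t) = U(s+t), and satisfies the Schrödinger equation in a finite-difference sense.

```python
HBAR = 1.0  # units with hbar = 1, an assumption made for convenience

# Hypothetical Hermitian 2x2 Hamiltonian, chosen only for illustration.
H = [[1.0 + 0j, 0.5 - 0.5j],
     [0.5 + 0.5j, -1.0 + 0j]]

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def expm(m, terms=60):
    """Matrix exponential of a 2x2 matrix by its truncated Taylor series."""
    result = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result


def U(t):
    """U(t) = exp(iAt) with A = -H/hbar."""
    return expm([[1j * t * (-h / HBAR) for h in row] for row in H])


def max_abs_diff(a, b):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


s, t, eps = 0.4, 0.7, 1e-4
unitarity_error = max_abs_diff(matmul(U(t), dagger(U(t))), IDENTITY)
group_error = max_abs_diff(matmul(U(s), U(t)), U(s + t))

# Central difference for dU/dt, then compare i*hbar*dU/dt with H*U(t).
up, down = U(t + eps), U(t - eps)
lhs = [[1j * HBAR * (up[i][j] - down[i][j]) / (2 * eps) for j in range(2)]
       for i in range(2)]
schrodinger_error = max_abs_diff(lhs, matmul(H, U(t)))
```

All three error values should be small; the last one is limited by the finite-difference step rather than by the construction of U(t).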

This is the first time that Planck's constant appears in this paper, a rather significant
observation. Everything that has been said on stochastic models, and most
of what we have said about symmetry, applies also to large-scale problems. Planck's
constant appears only when we meet the following fact, most recently commented on
by Alfsen and Schultz (1998): Physical variables play a dual role, as observables and
as generators of transformation groups. It is when observables are used as generators
of groups that the proportionality constant $\hbar$ occurs. One example is the Hamiltonian
and the Schrödinger equation; another example is the translation group in some direction,
say the x-axis, whose unitary representation is of the form $\exp(i a p_x/\hbar)$
for a translation by a distance $a$, with $p_x$ being the momentum in the x-direction.



13 A large scale example. 

Most so-called paradoxes of quantum mechanics seem to disappear in the statistical
setting described here. Take the Einstein-Podolsky-Rosen paradox for example (in
the form proposed by David Bohm): A composite particle with spin 0 disintegrates
into two single particles, whose spin components in any direction then necessarily
add to 0, hence are opposite. Sometimes the fact that an observation of one particle
implies the knowledge of the spin component of the other particle in the same
direction is taken as some action at a distance. One point is that the observer of
the first particle is free to choose the direction in which he wants to measure. In the
language of the present paper this observer chooses one particular experiment. No
paradox appears when we have a statistical interpretation of experiments in mind.

The following macroscopic example seems to be fairly similar: An organizer O
sends two envelopes I and II to each of two persons A and B, situated at different
places. In envelope I he has the choice between a red card or a black card; either
red to A and black to B or the opposite. In envelope II he has the choice between
a card with the number 1 or a card with the number 2; either 1 to A and 2 to B or
the opposite. Now O for simplicity chooses to have probability 1/2 for red to A in
envelope I, and he also chooses probability 1/2 for 1 to A in envelope II. However,
he may insist that there should be some correlation $\gamma$ between the two envelopes to
A, for instance defined by

$$P(\mathrm{red}, 1) = \tfrac{1}{4}(1 + \gamma),$$

where $-1 \le \gamma \le 1$. Correlation at B, defined in the same way, will then necessarily
also be $\gamma$.
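
The joint distribution fixed by these choices can be written out explicitly. The following sketch assumes a +1/-1 coding of the card contents (red and 1 as +1, black and 2 as -1), which is a convention introduced here and not in the paper; all function names are hypothetical. It tabulates the distribution at A, derives the one at B from the rule that B always receives the opposite cards, and computes the correlation at both places.

```python
from fractions import Fraction

# +1/-1 coding of card contents; an illustrative convention, not from the paper.
SIGN = {"red": 1, "black": -1, 1: 1, 2: -1}
OPPOSITE = {"red": "black", "black": "red", 1: 2, 2: 1}


def joint_at_A(gamma):
    """Joint distribution of (colour, number) at A with P(red, 1) = (1 + gamma)/4."""
    same = (1 + gamma) / 4
    diff = (1 - gamma) / 4
    return {("red", 1): same, ("black", 2): same,
            ("red", 2): diff, ("black", 1): diff}


def joint_at_B(gamma):
    """B always receives the opposite colour and the opposite number."""
    return {(OPPOSITE[c], OPPOSITE[n]): p
            for (c, n), p in joint_at_A(gamma).items()}


def marginal_red(joint):
    """Marginal probability of a red card in envelope I."""
    return sum(p for (c, _), p in joint.items() if c == "red")


def correlation(joint):
    """E[XY] under the +1/-1 coding; the means are 0 and the variances 1."""
    return sum(SIGN[c] * SIGN[n] * p for (c, n), p in joint.items())


gamma = Fraction(1, 3)  # hypothetical value of the correlation
dist_A = joint_at_A(gamma)
dist_B = joint_at_B(gamma)
```

With exact rational arithmetic the marginals come out as 1/2 and the correlation equals gamma at both A and B, as stated in the text.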

The initial state of the system is then determined, since we know the probability
distribution of any experiment that A and B could choose to do. To make completely
concrete that some randomness is involved, one can imagine several independent
pairs $(A_1, B_1), \ldots, (A_n, B_n)$, all in the same state, that is, under the same joint
probability distribution generated by O.

Imagine now that the following happens: A is given the order that he should
open one and only one envelope. Once he has opened one envelope, the other one
is destroyed (by some mechanism which is irrelevant here). A is therefore given
the choice between two noncompatible experiments, exactly the situation discussed
in this paper. If he opens envelope I and finds a red card, he can make one set of
predictions concerning the content of B's two envelopes; if he opens envelope II and
sees the number 2, he can make another set of predictions, etc. The situation is at
least in some sense similar to the EPR-experiment. There is no action at a distance,
only predictions conditioned on different information.
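
The two sets of predictions can be computed by ordinary conditioning. The sketch below assumes the joint distribution with P(red, 1) = (1 + gamma)/4 at A and the rule that B holds the opposite cards; the function names and the value of gamma are hypothetical.

```python
from fractions import Fraction

OPPOSITE = {"red": "black", "black": "red", 1: 2, 2: 1}


def joint_at_A(gamma):
    """Joint distribution of (colour, number) at A with P(red, 1) = (1 + gamma)/4."""
    same = (1 + gamma) / 4
    diff = (1 - gamma) / 4
    return {("red", 1): same, ("black", 2): same,
            ("red", 2): diff, ("black", 1): diff}


def predict_B(gamma, envelope, seen):
    """Distribution of B's (colour, number) given what A saw in one envelope.

    envelope is "I" (colour) or "II" (number); seen is the observed content.
    """
    index = 0 if envelope == "I" else 1
    joint = joint_at_A(gamma)
    total = sum(p for outcome, p in joint.items() if outcome[index] == seen)
    return {(OPPOSITE[c], OPPOSITE[n]): p / total
            for (c, n), p in joint.items() if (c, n)[index] == seen}


gamma = Fraction(1, 3)  # hypothetical value of the correlation
after_red = predict_B(gamma, "I", "red")  # A opened envelope I, saw red
after_two = predict_B(gamma, "II", 2)     # A opened envelope II, saw 2
```

After seeing red, A knows that B's envelope I is black and assigns probability (1 + gamma)/2 to the number 2 in B's envelope II; after seeing 2, A knows B's number and assigns probability (1 + gamma)/2 to red. Nothing changes at B's location; only A's information differs.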

As the experiment is described above, however, the well known inequalities of Bell
(1966) will always be satisfied, since the knowledge possessed by O can be regarded
as a hidden variable. But the example can be extended in several directions, thus
possibly making it more similar to its quantum mechanical counterparts.

The following concept is of interest in general, and specifically in this example:
Pick a set $\mathcal{A}_1$ of potential experiments. Then the set
$\{(a, \theta_a) : a \in \mathcal{A}_1\}$ may be called permissible if there is a group $G_1$
on $\mathcal{A}_1$ such that (i) $g_1 a \in \mathcal{A}_1$ whenever $a \in \mathcal{A}_1$ and
$g_1 \in G_1$, and (ii) $(g_1 a, \theta_{g_1 a}(g\phi))$ is a function of $(a, \theta_a(\phi))$
for all choices of $g_1, g$. This is equivalent to permissibility of the parameter in a
compound experiment, where $\phi$ is augmented by the 'parameter' $a$, and the group of
this new $\phi$ is given by $\{(g_1, g)\}$. In the present example, the 3 transformations
corresponding to $P_\theta(r) \leftrightarrow P_\theta(b)$ (envelope I),
$P_\theta(1) \leftrightarrow P_\theta(2)$ (envelope II) and I $\leftrightarrow$ II (change
envelope) constitute such a structure. The smallest homomorphic image of $S_4$ (the
permutation group of 4 elements) which supports this structure is $S_3$, which is
noncommutative and has an interesting 2-dimensional irreducible representation.
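
The two properties of $S_3$ mentioned here can be verified directly. The sketch below represents the elements of $S_3$ as permutations of three points and uses the standard character criterion for irreducibility (the squared character averaged over the group equals 1); the character of the 2-dimensional representation is taken as the number of fixed points minus one, which is the usual construction and an assumption about which representation is meant. The function names are hypothetical.

```python
from itertools import permutations

GROUP = list(permutations(range(3)))  # the 6 elements of S3 as tuples


def compose(p, q):
    """Composition (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))


def character_standard(p):
    """Character of the 2-dimensional representation: fixed points minus one."""
    return sum(1 for i in range(3) if p[i] == i) - 1


# Two transpositions that do not commute.
swap01 = (1, 0, 2)
swap12 = (0, 2, 1)
noncommutative = compose(swap01, swap12) != compose(swap12, swap01)

# Irreducibility criterion: the averaged squared character equals 1.
norm = sum(character_standard(p) ** 2 for p in GROUP) / len(GROUP)
dimension = character_standard((0, 1, 2))  # character at the identity
```

The character takes the value 2 at the identity, 0 at the three transpositions and -1 at the two 3-cycles, so the averaged square is (4 + 0 + 2)/6 = 1.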

Several concepts of this paper may be illustrated on this simple example, if 
necessary by using the structure just described. Alternatively, the example could 
have been made more complicated to start with, so that each experiment supported a 
group with a nontrivial irreducible representation. These (not quite trivial) exercises 
will be left to the interested reader. One extension will be discussed in Helland 
(1999). 



14 Discussion. 

Throughout this paper we have stated some assumptions; in addition there are several
implicit assumptions of a more technical/mathematical type. Though most of
these assumptions are rather plausible, we are not completely sure that all of them
will be satisfied in all details in a future hypothetical theory. Their main function
here has been to pave the way to ordinary quantum mechanics from what is our
basic framework: a set of models for experiments with a common state space, subject
to some symmetry group.

Our conceptual framework works both for continuous and for discrete variables,
and, apart from the discussion of time evolution, there is not much that is
non-relativistic about it. A very interesting - but probably not trivial - question is
therefore to what extent it can be developed further in a relativistic setting, perhaps
even with the kind of symmetry groups that are natural to think of in the context of
general relativity.

Of some interest in this connection are the views expressed by Dirac (1978): 
He wanted relativistic quantum mechanics to take another direction, and explicitly 
mentioned group representations of the Poincaré group as a clue to this direction.

In the same paper, Paul Dirac expressed the following opinion: 'One should keep 
the need for a sound mathematical basis dominating one's search for a new theory. 
Any physical or philosophical idea that one has must be adjusted to the mathematics. 
Not the other way around.' 

I strongly agree with Dirac that any fundamental physical theory should have a
sound mathematical basis. In fact, much of the motivation behind the present paper
has been to find just that. However, every mathematical theory must be
based on a set of axioms. It may well be that the axioms found by mathematicians
using aesthetic or quasi-logical criteria are among the right ones to serve as a basis
for a physical theory, but these criteria alone often leave one with too much
arbitrariness. In fact, and unfortunately, much formalistic theory has been the result
of only taking such points of departure. To find the basis for a new theory, one
must endeavor to look in all directions and draw upon all sources, whether they relate
to common sense, experience, mathematics, philosophy or large scale physics.

This implies extremely difficult requirements on the developer of theories, and 
intuitive reasoning and even imprecise arguments seem to be unavoidable at inter- 
mediate stages of the development. Having said that, however, at the final stage 
mathematical rigor should be imperative. 

It can be inferred from the ideas of this paper that much of what has been
said about modelling in microcosmos also can be transferred to macrocosmos. In
fact, large scale examples showing quantum mechanical behavior have recently been
demonstrated in the physical literature (Aerts, 1996). As indicated above, several
examples of this type can be constructed.

Having stressed the similarities between models of quantum theory and those of
statistics in this paper, it should of course also be underlined that there are basic
differences. In statistics the parameters are always regarded as unknown and subject
to inference; in quantum physics the states/parameters can often be regarded as
known. Also, the rich set of symmetries that is typical for all applications of quantum
theory seldom finds its counterpart, at least not to a similar degree, in large scale
statistics.

Nevertheless, even though it is not a main issue here, it should be mentioned that
some of the ideas behind this paper also seem to have an impact on issues discussed
in ordinary statistical science. The question of how to condition statistical models
gives another meaning to the label 'a' in
$\mathcal{E}_a = (\mathcal{X}_a, \mathcal{F}_a, \{P_\theta; \theta \in \Theta_a\})$ (Helland,
1995, and references there). The link between this issue and quantum mechanics
was suggested already by Barndorff-Nielsen (1995).

In this paper, and in most statistical papers, a mathematical structure like $\mathcal{E}_a$
above is connected to a real, physical or otherwise, experiment. But in addition,
since $\{P_\theta\}$ is included in the structure, it can be used to denote different models for
the same real experiment. This idea together with the symmetry discussion of the
present paper can in fact be used to discuss model reduction in statistics. We hope
to develop these ideas further elsewhere.